1 Objectives & Background

This is an RMarkdown document that we will work through together during the 20230409 morning sessions. Some of the objectives are:

  • Practical understanding of setting up a clean RStudio environment
  • Getting into the habit of documenting necessary information in RMarkdown
  • Using knitr to generate documents
  • Inspecting the count matrices generated from cellranger
  • Extracting gene-level meta information
  • Working through Seurat standard clustering pipeline
    • Quality control

2 RStudio Preparations

We assume that you have installed the latest R on your laptop (currently R 4.2.3), and also updated to the latest RStudio (in my case it is 2023.03.0+386).

The following code ensures that the packages I install are placed in a defined directory:

.libPaths("~/R_xenopus")
.libPaths()
## [1] "/Users/chlee/R_xenopus"                                        
## [2] "/Library/Frameworks/R.framework/Versions/4.2/Resources/library"
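One detail worth knowing is that .libPaths() only keeps directories that already exist, so a missing folder is silently ignored. Below is a minimal sketch of creating the folder first and putting it in front of the existing search path. The directory name is a stand-in inside tempdir() (an assumption for this illustration), not the ~/R_xenopus folder used in the workshop.

```r
# Hypothetical library directory for this sketch; the workshop uses "~/R_xenopus"
lib_dir <- file.path(tempdir(), "R_demo_lib")

# .libPaths() drops directories that do not exist, so create the folder first
if (!dir.exists(lib_dir)) {
  dir.create(lib_dir, recursive = TRUE)
}

# Put the new directory in front while keeping the previous search path
old_paths <- .libPaths()
.libPaths(c(lib_dir, old_paths))

# The first entry is where install.packages() places new packages by default
first_path <- .libPaths()[1]
print(first_path)
```

If the first entry is not the folder you expected, check that the folder exists before calling .libPaths().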

The following code installs the Bioconductor package manager. The chunk option eval=FALSE ensures that it is not run again each time the RMarkdown document is knitted.

if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install(version = "3.16")

Now let us install Seurat. One of the strengths of R comes from the set of packages developed by the RStudio group, tidyverse, so let’s install this as well (you may have it installed already). I also want to add one more small package, tictoc, which is handy for measuring how long a piece of code takes to run.

During this installation run, which will take a few minutes, R asks whether the igraph package should be compiled from source. At least on Mac OSX (on 2023-04-07), this fails, so do not compile igraph; just use the older pre-compiled version instead.

BiocManager::install(c("tidyverse", "Seurat", "tictoc", "devtools") )
# See this: https://github.com/Toniiiio/imageclipr
# devtools::install_github('Timag/imageclipr')
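If you re-run the installation often, it can help to first check which packages are actually missing. The following is a small sketch using requireNamespace() from base R. The package names here are placeholders (two that ship with R and one deliberately made-up name), not the workshop list.

```r
# Placeholder package names: two base packages and one name that should not exist
wanted <- c("stats", "utils", "notARealPackage123")

# requireNamespace() returns TRUE or FALSE without attaching the package
is_installed <- vapply(wanted, requireNamespace, logical(1), quietly = TRUE)

# Keep only the names that could not be found
missing_pkgs <- wanted[!is_installed]
print(missing_pkgs)

# Only the missing ones would then be passed on, for example:
# BiocManager::install(missing_pkgs)
```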

The following code checks whether the installed packages are all up to date. Note that the igraph package is reported as out of date, but this is OK; leave it.

BiocManager::valid()
## Warning: 1 packages out-of-date; 0 packages too new
## 
## * sessionInfo()
## 
## R version 4.2.3 (2023-03-15)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Big Sur ... 10.16
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## loaded via a namespace (and not attached):
##  [1] digest_0.6.31       R6_2.5.1            jsonlite_1.8.4     
##  [4] evaluate_0.20       cachem_1.0.7        rlang_1.1.0        
##  [7] cli_3.6.1           rstudioapi_0.14     jquerylib_0.1.4    
## [10] bslib_0.4.2         rmarkdown_2.21      tools_4.2.3        
## [13] xfun_0.38           yaml_2.3.7          fastmap_1.1.1      
## [16] compiler_4.2.3      BiocManager_1.30.20 htmltools_0.5.5    
## [19] knitr_1.42          sass_0.4.5         
## 
## Bioconductor version '3.16'
## 
##   * 1 packages out-of-date
##   * 0 packages too new
## 
## create a valid installation with
## 
##   BiocManager::install("igraph", update = TRUE, ask = FALSE, force = TRUE)
## 
## more details: BiocManager::valid()$too_new, BiocManager::valid()$out_of_date

The BiocManager::valid() call above already printed sessionInfo(), but please include sessionInfo() in all your R runs for reproducibility. It lists the R version, the platform, and every attached or loaded package with its version, so you can track down reproducibility issues.

sessionInfo()
## R version 4.2.3 (2023-03-15)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Big Sur ... 10.16
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## loaded via a namespace (and not attached):
##  [1] digest_0.6.31       R6_2.5.1            jsonlite_1.8.4     
##  [4] evaluate_0.20       cachem_1.0.7        rlang_1.1.0        
##  [7] cli_3.6.1           rstudioapi_0.14     jquerylib_0.1.4    
## [10] bslib_0.4.2         rmarkdown_2.21      tools_4.2.3        
## [13] xfun_0.38           yaml_2.3.7          fastmap_1.1.1      
## [16] compiler_4.2.3      BiocManager_1.30.20 htmltools_0.5.5    
## [19] knitr_1.42          sass_0.4.5
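Besides printing it in the document, you may want to keep the session information as a text file next to your results. A minimal sketch, assuming a file name of my own choosing inside tempdir():

```r
# Hypothetical output file; in a real project this would live in the project folder
session_file <- file.path(tempdir(), "20230409_sessionInfo.txt")

# capture.output() turns the printed report into a character vector, one line per element
writeLines(capture.output(sessionInfo()), session_file)

# Read it back to confirm what was stored
saved <- readLines(session_file)
print(head(saved, 3))
```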

Double check where you are. This helps when you want to use relative paths later:

getwd() # Usually the Document directory
## [1] "/Users/chlee/Dropbox (HMS)/tabinLab/presentation/20230408(XenopusBioinfo2023)/20230409"
here::here() # Usually the project directory
## [1] "/Users/chlee/Dropbox (HMS)/tabinLab/presentation/20230408(XenopusBioinfo2023)"

(It should match the project folder shown in the top right corner of RStudio.)

3 Load libraries and other document-wide parameters

First, load the libraries. After library() is called, you can use a package's exported functions without prefixing the package name. It is also useful to set up a project prefix so that all intermediate files can be tracked more easily.

library(tidyverse) # we mostly use dplyr library
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.1     ✔ readr     2.1.4
## ✔ forcats   1.0.0     ✔ stringr   1.5.0
## ✔ ggplot2   3.4.2     ✔ tibble    3.2.1
## ✔ lubridate 1.9.2     ✔ tidyr     1.3.0
## ✔ purrr     1.0.1     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(Seurat)
## Attaching SeuratObject
library(patchwork)

theme_set( theme_bw() )

root.dir <- here::here()

"%ni%" <- Negate("%in%")
project.prefix <- "20230409_"

sessionInfo()
## R version 4.2.3 (2023-03-15)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Big Sur ... 10.16
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] patchwork_1.1.2    SeuratObject_4.1.3 Seurat_4.3.0       lubridate_1.9.2   
##  [5] forcats_1.0.0      stringr_1.5.0      dplyr_1.1.1        purrr_1.0.1       
##  [9] readr_2.1.4        tidyr_1.3.0        tibble_3.2.1       ggplot2_3.4.2     
## [13] tidyverse_2.0.0   
## 
## loaded via a namespace (and not attached):
##   [1] Rtsne_0.16             colorspace_2.1-0       deldir_1.0-6          
##   [4] ellipsis_0.3.2         ggridges_0.5.4         rprojroot_2.0.3       
##   [7] spatstat.data_3.0-1    rstudioapi_0.14        leiden_0.4.3          
##  [10] listenv_0.9.0          ggrepel_0.9.3          fansi_1.0.4           
##  [13] codetools_0.2-19       splines_4.2.3          cachem_1.0.7          
##  [16] knitr_1.42             polyclip_1.10-4        jsonlite_1.8.4        
##  [19] ica_1.0-3              cluster_2.1.4          png_0.1-8             
##  [22] uwot_0.1.14            spatstat.sparse_3.0-1  shiny_1.7.4           
##  [25] sctransform_0.3.5      BiocManager_1.30.20    compiler_4.2.3        
##  [28] httr_1.4.5             Matrix_1.5-4           fastmap_1.1.1         
##  [31] lazyeval_0.2.2         cli_3.6.1              later_1.3.0           
##  [34] htmltools_0.5.5        tools_4.2.3            igraph_1.4.1          
##  [37] gtable_0.3.3           glue_1.6.2             reshape2_1.4.4        
##  [40] RANN_2.6.1             Rcpp_1.0.10            scattermore_0.8       
##  [43] jquerylib_0.1.4        vctrs_0.6.1            nlme_3.1-162          
##  [46] spatstat.explore_3.1-0 progressr_0.13.0       lmtest_0.9-40         
##  [49] spatstat.random_3.1-4  xfun_0.38              globals_0.16.2        
##  [52] timechange_0.2.0       mime_0.12              miniUI_0.1.1.1        
##  [55] lifecycle_1.0.3        irlba_2.3.5.1          goftest_1.2-3         
##  [58] future_1.32.0          MASS_7.3-58.3          zoo_1.8-11            
##  [61] scales_1.2.1           spatstat.utils_3.0-2   hms_1.1.3             
##  [64] promises_1.2.0.1       parallel_4.2.3         RColorBrewer_1.1-3    
##  [67] yaml_2.3.7             gridExtra_2.3          reticulate_1.28       
##  [70] pbapply_1.7-0          sass_0.4.5             stringi_1.7.12        
##  [73] rlang_1.1.0            pkgconfig_2.0.3        matrixStats_0.63.0    
##  [76] evaluate_0.20          lattice_0.21-8         tensor_1.5            
##  [79] ROCR_1.0-11            htmlwidgets_1.6.2      cowplot_1.1.1         
##  [82] tidyselect_1.2.0       here_1.0.1             parallelly_1.35.0     
##  [85] RcppAnnoy_0.0.20       plyr_1.8.8             magrittr_2.0.3        
##  [88] R6_2.5.1               generics_0.1.3         DBI_1.1.3             
##  [91] pillar_1.9.0           withr_2.5.0            fitdistrplus_1.1-8    
##  [94] abind_1.4-5            survival_3.5-5         sp_1.6-0              
##  [97] future.apply_1.10.0    KernSmooth_2.23-20     utf8_1.2.3            
## [100] spatstat.geom_3.1-0    plotly_4.10.1          tzdb_0.3.0            
## [103] rmarkdown_2.21         grid_4.2.3             data.table_1.14.8     
## [106] digest_0.6.31          xtable_1.8-4           httpuv_1.6.9          
## [109] munsell_0.5.0          viridisLite_0.4.1      bslib_0.4.2
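To make the two helper definitions from this chunk more concrete, here is a short sketch of how the project prefix and the %ni% operator might be used. The file name, the gene names, and the use of tempdir() in place of here::here() are all assumptions for illustration.

```r
# Same definitions as in the chunk above
"%ni%" <- Negate("%in%")
project.prefix <- "20230409_"

# Stand-in for here::here(), so that the sketch runs anywhere
root.dir <- tempdir()

# Build an output path that carries the project prefix (hypothetical file name)
out_file <- file.path(root.dir, paste0(project.prefix, "qc_metrics.csv"))
print(basename(out_file))

# %ni% is the negation of %in%: keep the genes that are NOT in the exclusion list
genes <- c("hes5.1.L", "actb.L", "gapdh.S")
exclude <- c("actb.L")
kept <- genes[genes %ni% exclude]
print(kept)
```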

If you want to use a specific function from a package that you did NOT load with library(), you can always write [package name]::[function name], which I am going to do in a minute with the tictoc package:

tictoc::tic() # this is a function from the tictoc package
tictoc::toc() # this is a function from the tictoc package
## 0.001 sec elapsed

Why is this (sometimes) important? If you load many libraries, functions with identical names from different packages can mask one another depending on the load order, which makes it ambiguous which function is called. In that case it may be necessary to specify which package the function comes from.
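The masking problem can be reproduced without loading any extra package. In this sketch a home-made function called filter (an invented example) hides stats::filter(), and the :: prefix still reaches the original.

```r
# A user-defined filter() in the global environment masks stats::filter()
filter <- function(x, keep) x[keep]

# The unqualified call now uses our own function: simple subsetting
own_result <- filter(1:5, c(TRUE, FALSE, TRUE, FALSE, TRUE))
print(own_result)

# The qualified call still reaches the stats version: a 3-point moving average
moving_avg <- as.numeric(stats::filter(1:5, rep(1 / 3, 3)))
print(moving_avg)

# Clean up so that the masking does not leak into later code
rm(filter)
```

The same logic applies to dplyr::filter() and stats::filter(), which is the conflict reported when tidyverse is attached.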

4 Load toolkit

Konrad has very useful functions that you can also load. After running this code chunk, notice the changes in the Environment pane (typically the top right panel):

# Not necessary for the practice, but useful
source("https://raw.githubusercontent.com/xenbase-hub/workshop/main/toolbox.R")

5 Load 10X count data to Seurat

5.1 Load original data and perform inspection

Now let’s read a count matrix generated by 10X cellranger. Seurat has a handy function called Read10X that loads the data in a sparse matrix format (dgCMatrix). In the class, we will check where these data come from, but here it suffices to provide the directory that contains the necessary files:

tictoc::tic()
xenopus.data <- Read10X(data.dir = "./scCapSt27_count/outs/filtered_gene_bc_matrices/XENLA_GCA001663975v1_XBv9p2/")
# xenopus.data <- Read10X(data.dir = "./scCapSt27_xen10_1_20230408/outs/filtered_feature_bc_matrix/")

tictoc::toc()
## 1.826 sec elapsed
class(xenopus.data)
## [1] "dgCMatrix"
## attr(,"package")
## [1] "Matrix"
head(xenopus.data[,1:30]) # only the first 30 cellular barcodes
## 6 x 30 sparse Matrix of class "dgCMatrix"
##   [[ suppressing 30 column names 'AAACCTGAGCTATGCT-1', 'AAACCTGAGGGTTCCC-1', 'AAACCTGCAGATGGGT-1' ... ]]
##                                                                                
## gene25011|Xelaev18004747m   . . . . . . . . . . . . . . . . . . . . . . . . . .
## gene21250|Xetrov90028798m.L . . . . . . . . . . . . . . . . . . . . . . . . . .
## gene27977|Xelaev18004749m   . . . . . . . . . . . . . . . . . . . . . . . . . .
## gene26149|Xelaev18004750m   . . . . . . . . . . . . . . . . . . . . . . . . . .
## gene25611|Xelaev18004751m   . . . . . . . . . . . . . . . . . . . . . . . . . .
## gene30800|Xelaev18004752m   . . . . . . . . . . . . . . . . . . . . . . . . . .
##                                    
## gene25011|Xelaev18004747m   . . . .
## gene21250|Xetrov90028798m.L . . . .
## gene27977|Xelaev18004749m   . . . .
## gene26149|Xelaev18004750m   . . . .
## gene25611|Xelaev18004751m   . . . .
## gene30800|Xelaev18004752m   . . . .
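The dots in the printout above stand for zeros that are not stored. To get a feel for the dgCMatrix format, here is a toy sparse matrix built by hand with the Matrix package. The gene and cell names and the values are made up for illustration.

```r
library(Matrix)

# Three non-zero entries given as (row, column, value) triplets
toy_counts <- sparseMatrix(
  i = c(1, 3, 2),
  j = c(1, 1, 3),
  x = c(5, 2, 7),
  dims = c(3, 4),
  dimnames = list(paste0("gene", 1:3), paste0("cell", 1:4))
)

# Zeros are printed as dots, as in the Read10X output
print(toy_counts)

# Fraction of entries that are zero and therefore not stored
frac_zero <- 1 - nnzero(toy_counts) / prod(dim(toy_counts))
print(frac_zero)
```

Real scRNA-seq count matrices are mostly zeros, which is why a sparse format saves a lot of memory.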

For the xenopus.data matrix, the rows represent genes (features) and the columns represent cellular barcodes. You can take a peek at how the cell names are represented:

head( colnames(xenopus.data) )
## [1] "AAACCTGAGCTATGCT-1" "AAACCTGAGGGTTCCC-1" "AAACCTGCAGATGGGT-1"
## [4] "AAACCTGCAGCCACCA-1" "AAACCTGGTCTGCCAG-1" "AAACCTGGTTCACGGC-1"
nchar("AAACCTGAGCTATGCT-1")
## [1] 18

This is typical cellranger output, where the 16 bp barcode sequence is suffixed with -1.
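A quick way to confirm this format for a set of cell names is to split off the suffix and check the remaining sequence. A small sketch using three of the barcodes printed above:

```r
# Three cell names copied from the output above
barcodes <- c("AAACCTGAGCTATGCT-1", "AAACCTGAGGGTTCCC-1", "AAACCTGCAGATGGGT-1")

# Remove the trailing "-<number>" to get the bare barcode sequence
sequences <- sub("-[0-9]+$", "", barcodes)
print(nchar(sequences))

# Keep only what follows the dash
suffixes <- sub("^.*-", "", barcodes)
print(suffixes)

# TRUE when a name is exactly 16 nucleotides followed by a numeric suffix
well_formed <- grepl("^[ACGT]{16}-[0-9]+$", barcodes)
print(well_formed)
```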

Now let’s create a SeuratObject, the data structure used by the Seurat package. With a few parameters, you can already do some filtering at this step.

xenopus <- CreateSeuratObject( 
  counts = xenopus.data,  # Here you put your count matrix
  project = "XenopusBioInfo2023",   # This is just a handy name attached to the object
  min.cells = 3,  # Keep only genes detected in at least 3 cells
  min.features = 200 # Keep only cells with at least 200 detected genes
)
## Warning: Feature names cannot have underscores ('_'), replacing with dashes
## ('-')
## Warning: Feature names cannot have pipe characters ('|'), replacing with dashes
## ('-')

It is often helpful to pay attention to the warnings that arise. Here you have two warning messages. Let’s check what they mean.

First, “Feature names cannot have underscores”. Are there genes that have underscores?

# grep is a common UNIX command that also exists as an R function
grep("_", rownames(xenopus.data), value = T)
## [1] "gene16511|car1_predicted.S" "gene134|hes5_X2.L"         
## [3] "gene9235|hes5_X1.L"         "gene13524|hes5_X2.S"

Yes, there are four gene names that contain an underscore. hes5 sounds familiar, so let’s check whether there are any issues with this gene:

grep("hes5", rownames(xenopus.data), value = T)
## [1] "gene19724|hes5.2.L"  "gene18133|hes5.1.L"  "gene134|hes5_X2.L"  
## [4] "gene9235|hes5_X1.L"  "gene34361|hes5.2.S"  "gene37268|hes5.1.S" 
## [7] "gene13524|hes5_X2.S"

As you can see, there are 7 different feature names associated with hes5, more than the usual L and S forms. It might be helpful to go back and see whether they represent something in the JBrowser.

Let’s check the feature/gene names in the original count matrix loaded:

rownames(xenopus.data) %>% head()
## [1] "gene25011|Xelaev18004747m"   "gene21250|Xetrov90028798m.L"
## [3] "gene27977|Xelaev18004749m"   "gene26149|Xelaev18004750m"  
## [5] "gene25611|Xelaev18004751m"   "gene30800|Xelaev18004752m"

As you can see, the gene names generated by cellranger have a format that contains “|”. How many are there?

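# "|" is the regex alternation operator, so fixed = TRUE is needed to match it literally
grep("|", rownames(xenopus.data), value = TRUE, fixed = TRUE) %>% length()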
## [1] 41560
nrow(xenopus.data)
## [1] 41560

So all genes are named in this format. When this count matrix was imported into a Seurat object, the CreateSeuratObject function issued a warning and did the following:

grep("hes5", rownames(xenopus), value = T)
## [1] "gene134-hes5-X2.L"

Two things happened here. First, in gene134|hes5_X2.L the characters “_” and “|” were both replaced with “-”. Second, only one of the seven hes5 features found in the original matrix is still present.
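Two regular-expression details are easy to miss when working with these names. As a pattern, “|” means “or”, so on its own it matches every string; it has to be matched literally. The sketch below shows this on toy names and then mimics the replacement reported in the warnings. It is an illustration in base R, not Seurat's internal code, and the third name is made up.

```r
# Toy feature names; the first two are taken from the output above
features <- c("gene134|hes5_X2.L", "gene19724|hes5.2.L", "toy_without_pipe")

# As a regular expression, "|" alone matches every string
n_regex <- length(grep("|", features))
print(n_regex)

# fixed = TRUE treats the pattern as a literal character
with_pipe <- grep("|", features, fixed = TRUE, value = TRUE)
print(with_pipe)

# Inside a character class "|" is literal: replace "_" and "|" with "-"
sanitized <- gsub("[_|]", "-", features)
print(sanitized)
```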

Quiz: Where are the other 6 hes5 genes that were found in the original sparse count matrix?

You can answer this here (by changing the Markdown file).

We can also check how the number of cells changed during the import:

ncol(xenopus.data)
## [1] 5263
ncol(xenopus)
## [1] 5085
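The difference is 5263 - 5085 = 178 cells, which is consistent with cells being dropped by the min.features threshold. The sketch below mimics the two thresholds on a toy matrix in base R. It assumes that the cell filter is applied before the gene filter, which is my understanding of Seurat 4 but is not shown in this document. The matrix and the thresholds are made up.

```r
# Toy count matrix: 4 genes (rows) by 5 cells (columns)
counts <- matrix(
  c(1, 0, 2, 0, 3,
    0, 0, 1, 0, 0,
    4, 1, 1, 0, 2,
    0, 0, 5, 0, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", LETTERS[1:4]), paste0("cell", 1:5))
)

min_features <- 2  # toy threshold: a cell needs at least 2 detected genes
min_cells <- 3     # toy threshold: a gene must be detected in at least 3 cells

# Step 1 (assumed order): drop cells with too few detected genes
genes_per_cell <- colSums(counts > 0)
kept <- counts[, genes_per_cell >= min_features, drop = FALSE]

# Step 2: drop genes detected in too few of the remaining cells
cells_per_gene <- rowSums(kept > 0)
kept <- kept[cells_per_gene >= min_cells, , drop = FALSE]

print(dim(counts))
print(dim(kept))
print(dimnames(kept))
```

Comparing dim() before and after, as done above with ncol(), is a cheap sanity check after any filtering step.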

For most standard scRNA-seq workflows, you are interested in categories of genes that belong together. One important category that is almost always presented in tutorials is the mitochondrial genes. They are special in that their RNA originates in a different subcellular compartment. Dying cells tend to have a higher fraction of mitochondrial transcripts relative to nuclear ones. Do you have mitochondrial genes in this count matrix?
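The idea of a mitochondrial fraction can be sketched on a toy matrix. The gene names (including the "mt-" prefix), the cell names, and the counts below are invented for illustration; the names in your own reference may look quite different.

```r
# Toy counts: two hypothetical mitochondrial genes and two nuclear genes
counts <- matrix(
  c( 2, 20,
     3, 20,
    45, 30,
    50, 30),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("mt-nd1", "mt-cytb", "actb", "gapdh"),
                  c("healthy_cell", "dying_cell"))
)

# Flag mitochondrial genes by their (assumed) name prefix
is_mito <- grepl("^mt-", rownames(counts))

# Percentage of each cell's counts that come from mitochondrial genes
percent_mito <- 100 * colSums(counts[is_mito, , drop = FALSE]) / colSums(counts)
print(percent_mito)
```

A per-cell percentage like this is what a QC threshold on mitochondrial content would then be applied to.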

Let’s guess (which many tutorials do) whether a usual name is present in the gene list:

grep("cytb", rownames(xenopus.data), value = T)
## character(0)
grep("CYTB", rownames(xenopus.data), value = T)
## character(0)
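Instead of trying each capitalisation separately, grep() can ignore case. A short sketch on made-up names:

```r
# Hypothetical feature names with different capitalisation
toy_names <- c("MT-CYTB", "actb.L", "Cytb.S", "gapdh.L")

# The case-sensitive search misses both variants
strict_hits <- grep("cytb", toy_names, value = TRUE)
print(strict_hits)

# ignore.case = TRUE finds them regardless of capitalisation
loose_hits <- grep("cytb", toy_names, value = TRUE, ignore.case = TRUE)
print(loose_hits)
```

An empty result from a case-insensitive search is still only a hint, since the genes may be present under completely different names.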

The best way is to go back to the reference annotation used to build the STAR index that cellranger relied on to generate the count matrix.

Quiz: Can you identify the meta information retrievable from GEO that tells you whether the mitochondrial chromosome is present, and identify the mitochondrial gene names?

Below are some possible ways to quickly extract the chromosome names as well as the gene names from a GTF file (here the GTF file is from our most up-to-date Xenla10.1):

# Example code in console. You could potentially use chunk header bash instead of r to run this also in the document.

# For linux
# zcat XENLA_ncbi101.XB2023_04.gtf.gz | gawk '{ print $1 }' | uniq
# For Mac OSX
# gzcat XENLA_ncbi101.XB2023_04.gtf.gz | gawk '{ print $1 }' | uniq
# gzcat XENLA_ncbi101.XB2023_04.gtf.gz | gawk '( $1 == "chrM" )'
gzcat XENLA_ncbi101.XB2023_04.gtf.gz | gawk '{ print $1 }' | uniq
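The same two questions (which chromosomes are present, and which genes sit on the mitochondrial one) can also be answered from within R. The sketch below writes a tiny made-up GTF to a temporary file and parses it with base R. The gene names, the coordinates, and the "chrM" label in the toy lines are assumptions for illustration, not content of the real annotation.

```r
# Four toy GTF lines: a header comment and three gene records (tab-separated)
gtf_lines <- c(
  "#!toy header",
  "chr1L\ttoy\tgene\t100\t900\t.\t+\t.\tgene_id \"g1\"; gene_name \"hes5.1.L\";",
  "chr1L\ttoy\tgene\t1500\t2500\t.\t-\t.\tgene_id \"g2\"; gene_name \"actb.L\";",
  "chrM\ttoy\tgene\t1\t600\t.\t+\t.\tgene_id \"g3\"; gene_name \"nd1\";"
)
gtf_file <- tempfile(fileext = ".gtf")
writeLines(gtf_lines, gtf_file)

# quote = "" keeps the double quotes inside the attribute column intact
gtf <- read.delim(gtf_file, header = FALSE, comment.char = "#",
                  quote = "", stringsAsFactors = FALSE)

# Column 1 holds the sequence (chromosome) names
seqnames <- unique(gtf$V1)
print(seqnames)

# Gene records on the mitochondrial sequence; pull gene_name from column 9
mito <- gtf[gtf$V1 == "chrM" & gtf$V3 == "gene", ]
mito_genes <- sub('.*gene_name "([^"]+)".*', "\\1", mito$V9)
print(mito_genes)
```

For a real annotation you would point read.delim() at the downloaded file instead of the temporary one.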

Do your own work

grep("44447", rownames(xenopus.data), value = T)
## [1] "gene44447|LOC108708778"

5.2 Load count matrix from the most up-to-date reference

As I presented yesterday, I have retrieved the archived SRA.lite files and generated FASTQ files to run cellranger (6.0.1) again to generate count matrices with the reference that contains mitochondrial genes.

Because the cellranger version used here differs from the one that generated the original count matrices (Quiz: what version is it?), the directory structure for loading is slightly different. Let’s redo all the steps we did in preparation for the standard workflow.

One side comment:

It is generally not recommended to re-use (overwrite) the same variable names, for the sake of reproducibility. One practice is to clean up this RMarkdown file into a final version that skips the detour of loading the original count matrix; another is to set more explicit rules for tracking which variable names belong to which dataset. Here, to be explicit, we will remove the previous variables and then reuse their names.

rm(xenopus.data)
rm(xenopus)

tictoc::tic()
xenopus.data <- Read10X(data.dir = "./scCapSt27_xen10_1_20230408/outs/filtered_feature_bc_matrix/")
tictoc::toc()
## 1.432 sec elapsed
xenopus <- CreateSeuratObject( 
  counts = xenopus.data,  # Here you put your count matrix
  project = "XenopusBioInfo2023",   # This is just a handy name attached to the object
  min.cells = 3,  # At least 3 cells should have a particular gene expressed
  min.features = 200 # At least a cell should have 200 genes detected to be included
)
## Warning: Feature names cannot have underscores ('_'), replacing with dashes
## ('-')

Again, it is worthwhile to dig in a bit:

# grep is a common UNIX command that also exists as an R function
grep("_", rownames(xenopus.data), value = T)
##    [1] "trnar-acg_1"        "trnar-acg_2"        "trnav-cac_1"       
##    [4] "trnae-cuc_1"        "trnav-aac_1"        "trnar-ccu_1"       
##    [7] "trnar-acg_3"        "trnar-acg_4"        "trnae-cuc_2"       
##   [10] "trnav-aac_2"        "trnar-acg_5"        "trnav-aac_3"       
##   [13] "trnar-acg_6"        "trnav-cac_2"        "trnav-aac_4"       
##   [16] "trnav-cac_3"        "trnae-cuc_3"        "trnah-gug_1"       
##   [19] "trnav-aac_5"        "trnae-cuc_4"        "trnav-cac_4"       
##   [22] "trnar-acg_7"        "trnah-gug_2"        "trnav-aac_6"       
##   [25] "trnae-cuc_5"        "trnav-cac_5"        "trnar-acg_8"       
##   [28] "trnar-ccu_2"        "trnah-gug_3"        "trnav-aac_7"       
##   [31] "trnae-cuc_6"        "trnar-acg_9"        "trnar-ccu_3"       
##   [34] "trnav-aac_8"        "trnae-cuc_7"        "trnav-cac_6"       
##   [37] "trnar-acg_10"       "trnar-ccu_4"        "trnah-gug_4"       
##   [40] "trnav-aac_9"        "trnae-cuc_8"        "trnav-cac_7"       
##   [43] "trnar-acg_11"       "trnar-ccu_5"        "trnah-gug_5"       
##   [46] "trnav-cac_8"        "trnar-ccu_6"        "trnah-gug_6"       
##   [49] "trnav-aac_10"       "trnae-cuc_9"        "trnav-cac_9"       
##   [52] "trnah-aug_1"        "trnar-ccu_7"        "trnah-gug_7"       
##   [55] "trnav-aac_11"       "trnae-cuc_10"       "trnav-cac_10"      
##   [58] "trnar-acg_12"       "trnar-ccu_8"        "trnav-aac_12"      
##   [61] "trnae-cuc_11"       "trnar-acg_13"       "trnar-ccu_9"       
##   [64] "trnav-cac_11"       "trnar-acg_14"       "trnah-gug_8"       
##   [67] "trnav-aac_13"       "trnae-cuc_12"       "trnav-cac_12"      
##   [70] "trnar-acg_15"       "trnar-ccu_10"       "trnah-gug_9"       
##   [73] "trnav-aac_14"       "trnae-cuc_13"       "trnav-cac_13"      
##   [76] "trnar-acg_16"       "trnar-ccu_11"       "trnah-gug_10"      
##   [79] "trnav-aac_15"       "trnae-cuc_14"       "trnav-cac_14"      
##   [82] "trnar-acg_17"       "trnar-ccu_12"       "trnah-gug_11"      
##   [85] "trnav-aac_16"       "trnae-cuc_15"       "trnav-cac_15"      
##   [88] "trnar-acg_18"       "trnar-ccu_13"       "trnae-cuc_16"      
##   [91] "trnav-cac_16"       "trnar-acg_19"       "trnar-ccu_14"      
##   [94] "trnah-gug_12"       "trnav-aac_17"       "trnae-cuc_17"      
##   [97] "trnav-cac_17"       "trnar-acg_20"       "trnar-ccu_15"      
##  [100] "trnah-gug_13"       "trnav-aac_18"       "trnae-cuc_18"      
##  [103] "trnav-cac_18"       "trnar-acg_21"       "trnar-ccu_16"      
##  [106] "trnah-gug_14"       "trnav-aac_19"       "trnae-cuc_19"      
##  [109] "trnav-cac_19"       "trnar-acg_22"       "trnar-ccu_17"      
##  [112] "trnav-aac_20"       "trnar-acg_23"       "trnar-ccu_18"      
##  [115] "trnah-gug_15"       "trnav-aac_21"       "trnae-cuc_20"      
##  [118] "trnav-cac_20"       "trnar-acg_24"       "trnar-ccu_19"      
##  [121] "trnah-gug_16"       "trnav-aac_22"       "trnae-cuc_21"      
##  [124] "trnav-cac_21"       "trnar-acg_25"       "trnar-ccu_20"      
##  [127] "trnah-gug_17"       "trnav-aac_23"       "trnae-cuc_22"      
##  [130] "trnav-cac_22"       "trnar-acg_26"       "trnar-ccu_21"      
##  [133] "trnah-gug_18"       "trnav-aac_24"       "trnae-cuc_23"      
##  [136] "trnav-cac_23"       "trnar-acg_27"       "trnar-ccu_22"      
##  [139] "trnah-gug_19"       "trnav-aac_25"       "trnae-cuc_24"      
##  [142] "trnav-cac_24"       "trnar-acg_28"       "trnar-ccu_23"      
##  [145] "trnah-gug_20"       "trnav-aac_26"       "trnae-cuc_25"      
##  [148] "trnav-cac_25"       "trnar-acg_29"       "trnar-ccu_24"      
##  [151] "trnah-gug_21"       "trnav-aac_27"       "trnae-cuc_26"      
##  [154] "trnav-cac_26"       "trnar-acg_30"       "trnar-ccu_25"      
##  [157] "trnah-gug_22"       "trnav-aac_28"       "trnae-cuc_27"      
##  [160] "trnav-cac_27"       "trnar-acg_31"       "trnar-ccu_26"      
##  [163] "trnah-gug_23"       "trnav-cac_28"       "trnar-acg_32"      
##  [166] "trnar-ccu_27"       "trnah-gug_24"       "trnav-aac_29"      
##  [169] "trnav-cac_29"       "trnav-aac_30"       "trnav-cac_30"      
##  [172] "trnar-acg_33"       "trnav-cac_31"       "trnae-cuc_28"      
##  [175] "trnae-cuc_29"       "trnav-aac_31"       "trnav-aac_32"      
##  [178] "trnav-cac_32"       "trnar-ccu_28"       "trnah-gug_25"      
##  [181] "trnav-aac_33"       "trnae-cuc_30"       "trnav-cac_33"      
##  [184] "trnar-acg_34"       "trnah-gug_26"       "trnav-aac_34"      
##  [187] "trnae-cuc_31"       "trnav-cac_34"       "trnar-acg_35"      
##  [190] "trnar-ccu_29"       "trnav-aac_35"       "trnae-cuc_32"      
##  [193] "trnav-cac_35"       "trnar-acg_36"       "trnar-ccu_30"      
##  [196] "trnah-gug_27"       "trnav-aac_36"       "trnae-cuc_33"      
##  [199] "trnav-cac_36"       "trnar-acg_37"       "trnah-gug_28"      
##  [202] "trnav-aac_37"       "trnah-gug_29"       "trnav-aac_38"      
##  [205] "trnae-cuc_34"       "trnav-cac_37"       "trnar-acg_38"      
##  [208] "trnah-gug_30"       "trnav-aac_39"       "trnae-cuc_35"      
##  [211] "trnav-cac_38"       "trnar-acg_39"       "trnar-ccu_31"      
##  [214] "trnah-gug_31"       "trnav-aac_40"       "trnae-cuc_36"      
##  [217] "trnav-cac_39"       "trnar-acg_40"       "trnar-ccu_32"      
##  [220] "trnah-gug_32"       "trnav-aac_41"       "trnae-cuc_37"      
##  [223] "trnav-cac_40"       "trnar-acg_41"       "trnar-ccu_33"      
##  [226] "trnar-acg_42"       "trnar-ccu_34"       "trnah-gug_33"      
##  [229] "trnav-aac_42"       "trnah-aug_2"        "trnar-ccu_35"      
##  [232] "trnah-gug_34"       "trnav-aac_43"       "trnae-cuc_38"      
##  [235] "trnav-cac_41"       "trnar-acg_43"       "trnar-ccu_36"      
##  [238] "trnah-gug_35"       "trnav-aac_44"       "trnae-cuc_39"      
##  [241] "trnav-cac_42"       "trnar-acg_44"       "trnar-ccu_37"      
##  [244] "trnah-gug_36"       "trnav-aac_45"       "trnae-cuc_40"      
##  [247] "trnav-cac_43"       "trnar-acg_45"       "trnar-ccu_38"      
##  [250] "trnak-cuu_1"        "trnae-cuc_41"       "trnah-gug_37"      
##  [253] "trnay-gua_1"        "trnar-acg_46"       "trnap-ugg_1"       
##  [256] "trnak-cuu_2"        "trnaa-ugc_1"        "trnad-guc_1"       
##  [259] "trnay-gua_2"        "trnag-ucc_1"        "trnae-cuc_42"      
##  [262] "trnay-gua_3"        "trnar-acg_47"       "trnat-ugu_1"       
##  [265] "trnap-agg_1"        "trnap-ugg_2"        "trnaa-ugc_2"       
##  [268] "trnad-guc_2"        "trnaf-gaa_1"        "trnad-guc_3"       
##  [271] "trnap-agg_2"        "trnap-agg_3"        "trnad-guc_4"       
##  [274] "trnap-agg_4"        "trnap-agg_5"        "trnap-agg_6"       
##  [277] "trnaw-cca_1"        "trnad-guc_5"        "trnad-guc_6"       
##  [280] "trnai-aau_1"        "trnai-aau_2"        "trnad-guc_7"       
##  [283] "trnai-aau_3"        "trnad-guc_8"        "trnai-aau_4"       
##  [286] "trnad-guc_9"        "trnai-aau_5"        "trnai-aau_6"       
##  [289] "trnad-guc_10"       "trnaw-cca_2"        "trnaw-cca_3"       
##  [292] "trnaw-cca_4"        "trnaw-cca_5"        "trnaw-cca_6"       
##  [295] "trnaw-cca_7"        "trnaw-cca_8"        "trnaw-cca_9"       
##  [298] "trnaw-cca_10"       "trnaw-cca_11"       "trnaw-cca_12"      
##  [301] "trnaw-cca_13"       "trnag-ccc_1"        "trnag-ccc_2"       
##  [304] "trnag-ccc_3"        "trnag-ccc_4"        "trnag-ccc_5"       
##  [307] "trnag-ccc_6"        "trnag-ccc_7"        "trnag-ccc_8"       
##  [310] "trnag-ccc_9"        "trnag-ccc_10"       "trnag-ccc_11"      
##  [313] "trnag-ccc_12"       "trnag-ccc_13"       "trnag-ccc_14"      
##  [316] "trnag-ccc_15"       "trnag-ccc_16"       "trnag-ccc_17"      
##  [319] "trnag-ccc_18"       "trnag-ccc_19"       "trnag-ccc_20"      
##  [322] "trnag-ccc_21"       "trnae-uuc_1"        "trnav-aac_46"      
##  [325] "trnap-cgg_1"        "trnap-agg_7"        "trnap-agg_8"       
##  [328] "trnav-uac_1"        "trnap-agg_9"        "trnap-agg_10"      
##  [331] "trnap-agg_11"       "trnal-uaa_1"        "trnal-uaa_2"       
##  [334] "trnal-uaa_3"        "trnal-uaa_4"        "trnal-uaa_5"       
##  [337] "trnal-uaa_6"        "trnal-uaa_7"        "trnal-uaa_8"       
##  [340] "trnal-uaa_9"        "trnal-uaa_10"       "trnal-uaa_11"      
##  [343] "trnal-uaa_12"       "trnal-uaa_13"       "trnak-cuu_3"       
##  [346] "trnaw-cca_14"       "trnaw-cca_15"       "trnak-cuu_4"       
##  [349] "trnak-cuu_5"        "trnag-gcc_1"        "trnak-cuu_6"       
##  [352] "trnag-gcc_2"        "trnak-cuu_7"        "trnag-gcc_3"       
##  [355] "trnak-cuu_8"        "trnag-gcc_4"        "trnak-cuu_9"       
##  [358] "trnag-gcc_5"        "trnak-cuu_10"       "trnag-gcc_6"       
##  [361] "trnak-cuu_11"       "trnak-cuu_12"       "trnag-gcc_7"       
##  [364] "trnak-cuu_13"       "trnag-gcc_8"        "trnak-cuu_14"      
##  [367] "trnag-gcc_9"        "trnag-gcc_10"       "trnag-gcc_11"      
##  [370] "trnak-cuu_15"       "trnad-guc_11"       "trnac-gca_1"       
##  [373] "trnae-uuc_2"        "trnas-cga_1"        "trnae-uuc_3"       
##  [376] "trnan-guu_1"        "trnav-aac_47"       "trnav-aac_48"      
##  [379] "trnaq-cug_1"        "trnas-aga_1"        "trnas-aga_2"       
##  [382] "trnas-uga_1"        "trnas-aga_3"        "trnas-uga_2"       
##  [385] "trnaq-cug_2"        "trnas-aga_4"        "trnas-uga_3"       
##  [388] "trnas-uga_4"        "trnak-cuu_16"       "trnag-gcc_12"      
##  [391] "trnak-cuu_17"       "trnag-gcc_13"       "trnak-cuu_18"      
##  [394] "trnag-gcc_14"       "trnak-cuu_19"       "trnag-gcc_15"      
##  [397] "trnak-cuu_20"       "trnag-gcc_16"       "trnak-cuu_21"      
##  [400] "trnag-gcc_17"       "trnak-cuu_22"       "trnag-gcc_18"      
##  [403] "trnak-cuu_23"       "trnag-gcc_19"       "trnak-cuu_24"      
##  [406] "trnag-gcc_20"       "trnak-cuu_25"       "trnag-gcc_21"      
##  [409] "trnak-cuu_26"       "trnag-gcc_22"       "trnak-cuu_27"      
##  [412] "trnag-gcc_23"       "trnak-cuu_28"       "trnag-gcc_24"      
##  [415] "trnak-cuu_29"       "trnag-gcc_25"       "trnak-cuu_30"      
##  [418] "trnag-gcc_26"       "trnak-cuu_31"       "trnag-gcc_27"      
##  [421] "trnak-cuu_32"       "trnag-gcc_28"       "trnak-cuu_33"      
##  [424] "trnag-gcc_29"       "trnak-cuu_34"       "trnag-gcc_30"      
##  [427] "trnak-cuu_35"       "trnag-gcc_31"       "trnak-cuu_36"      
##  [430] "trnag-gcc_32"       "trnak-cuu_37"       "trnag-gcc_33"      
##  [433] "trnak-cuu_38"       "trnag-gcc_34"       "trnak-cuu_39"      
##  [436] "trnag-gcc_35"       "trnak-cuu_40"       "trnag-gcc_36"      
##  [439] "trnak-cuu_41"       "trnak-cuu_42"       "trnag-gcc_37"      
##  [442] "trnak-cuu_43"       "trnag-gcc_38"       "trnak-cuu_44"      
##  [445] "trnag-gcc_39"       "trnak-cuu_45"       "trnag-gcc_40"      
##  [448] "trnak-cuu_46"       "trnag-gcc_41"       "trnak-cuu_47"      
##  [451] "trnag-gcc_42"       "trnas-uga_5"        "trnal-uaa_14"      
##  [454] "trnag-ccc_22"       "trnag-ccc_23"       "trnag-ccc_24"      
##  [457] "trnag-ccc_25"       "trnag-ccc_26"       "trnag-ccc_27"      
##  [460] "trnag-ccc_28"       "trnag-ccc_29"       "trnag-ccc_30"      
##  [463] "trnat-ugu_2"        "trnad-guc_12"       "trnap-ugg_3"       
##  [466] "trnap-agg_12"       "trnah-gug_38"       "trnah-gug_39"      
##  [469] "trnak-uuu_1"        "trnai-aau_7"        "trnad-guc_13"      
##  [472] "trnad-guc_14"       "trnad-guc_15"       "trnad-guc_16"      
##  [475] "trnad-guc_17"       "trnad-guc_18"       "trnad-guc_19"      
##  [478] "trnad-guc_20"       "trnad-guc_21"       "trnad-guc_22"      
##  [481] "trnad-guc_23"       "trnad-guc_24"       "trnad-guc_25"      
##  [484] "trnad-guc_26"       "trnad-guc_27"       "trnad-guc_28"      
##  [487] "trnad-guc_29"       "trnad-guc_30"       "trnad-guc_31"      
##  [490] "trnad-guc_32"       "trnad-guc_33"       "trnad-guc_34"      
##  [493] "trnad-guc_35"       "trnad-guc_36"       "trnad-guc_37"      
##  [496] "trnad-guc_38"       "trnad-guc_39"       "trnad-guc_40"      
##  [499] "trnad-guc_41"       "trnad-guc_42"       "trnad-guc_43"      
##  [502] "trnad-guc_44"       "trnad-guc_45"       "trnad-guc_46"      
##  [505] "trnad-guc_47"       "trnad-guc_48"       "trnad-guc_49"      
##  [508] "trnad-guc_50"       "trnad-guc_51"       "trnad-guc_52"      
##  [511] "trnad-guc_53"       "trnad-guc_54"       "trnar-acg_48"      
##  [514] "trnav-cac_44"       "trnaq-cug_3"        "trnak-cuu_48"      
##  [517] "trnae-uuc_4"        "trnag-ccc_31"       "trnap-ugg_4"       
##  [520] "trnap-ugg_5"        "trnaq-uug_1"        "trnaq-uug_2"       
##  [523] "trnas-aga_5"        "trnaq-uug_3"        "trnas-aga_6"       
##  [526] "trnaq-uug_4"        "trnas-aga_7"        "trnaq-uug_5"       
##  [529] "trnaq-uug_6"        "trnaq-uug_7"        "trnaq-cug_4"       
##  [532] "trnaq-uug_8"        "trnas-aga_8"        "trnaq-cug_5"       
##  [535] "trnas-aga_9"        "trnaq-cug_6"        "trnaq-uug_9"       
##  [538] "trnas-aga_10"       "trnaq-cug_7"        "trnaq-uug_10"      
##  [541] "trnas-aga_11"       "trnaq-cug_8"        "trnaq-uug_11"      
##  [544] "trnas-aga_12"       "trnaq-cug_9"        "trnaq-uug_12"      
##  [547] "trnas-aga_13"       "trnaq-cug_10"       "trnaq-uug_13"      
##  [550] "trnas-aga_14"       "trnaq-cug_11"       "trnaq-uug_14"      
##  [553] "trnas-aga_15"       "trnaq-cug_12"       "trnaq-uug_15"      
##  [556] "trnas-aga_16"       "trnaq-cug_13"       "trnaq-uug_16"      
##  [559] "trnas-aga_17"       "trnaq-cug_14"       "trnaq-uug_17"      
##  [562] "trnas-aga_18"       "trnaq-cug_15"       "trnaq-uug_18"      
##  [565] "trnas-aga_19"       "trnaq-cug_16"       "trnaq-uug_19"      
##  [568] "trnas-aga_20"       "trnaq-cug_17"       "trnaq-uug_20"      
##  [571] "trnas-aga_21"       "trnaq-uug_21"       "trnae-uuc_5"       
##  [574] "trnag-ucc_2"        "trnak-uuu_2"        "trnam-cau_1"       
##  [577] "trnam-cau_2"        "trnav-aac_49"       "trnam-cau_3"       
##  [580] "trnag-ucc_3"        "trnak-cuu_49"       "trnam-cau_4"       
##  [583] "trnav-aac_50"       "trnam-cau_5"        "trnav-aac_51"      
##  [586] "trnae-uuc_6"        "trnag-ucc_4"        "trnav-aac_52"      
##  [589] "trnag-ucc_5"        "trnam-cau_6"        "trnav-aac_53"      
##  [592] "trnav-cac_45"       "trnak-cuu_50"       "trnav-cac_46"      
##  [595] "trnak-cuu_51"       "trnav-cac_47"       "trnae-uuc_7"       
##  [598] "trnak-uuu_3"        "trnam-cau_7"        "trnak-uuu_4"       
##  [601] "trnav-cac_48"       "trnan-guu_2"        "trnam-cau_8"       
##  [604] "trnak-uuu_5"        "trnan-guu_3"        "trnan-guu_4"       
##  [607] "trnav-cac_49"       "trnam-cau_9"        "trnak-uuu_6"       
##  [610] "trnan-guu_5"        "trnav-cac_50"       "trnav-cac_51"      
##  [613] "trnam-cau_10"       "trnan-guu_6"        "trnae-cuc_43"      
##  [616] "trnam-cau_11"       "trnan-guu_7"        "trnav-cac_52"      
##  [619] "trnae-cuc_44"       "trnae-cuc_45"       "trnam-cau_12"      
##  [622] "trnae-cuc_46"       "trnan-guu_8"        "trnav-cac_53"      
##  [625] "trnak-cuu_52"       "trnah-gug_40"       "trnav-aac_54"      
##  [628] "trnak-uuu_7"        "trnah-gug_41"       "trnah-gug_42"      
##  [631] "trnak-uuu_8"        "trnah-gug_43"       "trnak-uuu_9"       
##  [634] "trnak-uuu_10"       "trnak-cuu_53"       "trnak-uuu_11"      
##  [637] "trnar-acg_49"       "trnar-acg_50"       "trnar-acg_51"      
##  [640] "trnar-acg_52"       "trnar-acg_53"       "trnar-acg_54"      
##  [643] "trnar-acg_55"       "trnah-gug_44"       "trnae-cuc_47"      
##  [646] "trnar-ccu_39"       "trnah-gug_45"       "trnaa-agc_1"       
##  [649] "trnae-uuc_8"        "trnag-ucc_6"        "trnav-cac_54"      
##  [652] "trnag-gcc_43"       "trnak-uuu_12"       "trnav-aac_55"      
##  [655] "trnak-cuu_54"       "trnak-uuu_13"       "trnag-ucc_7"       
##  [658] "trnav-aac_56"       "trnak-cuu_55"       "trnag-gcc_44"      
##  [661] "trnag-ucc_8"        "trnak-uuu_14"       "trnar-acg_56"      
##  [664] "trnar-acg_57"       "trnav-aac_57"       "trnak-cuu_56"      
##  [667] "trnag-gcc_45"       "trnag-ucc_9"        "trnav-aac_58"      
##  [670] "trnak-cuu_57"       "trnak-uuu_15"       "trnar-acg_58"      
##  [673] "trnar-acg_59"       "trnar-acg_60"       "trnar-acg_61"      
##  [676] "trnak-uuu_16"       "trnak-cuu_58"       "trnav-aac_59"      
##  [679] "trnag-ucc_10"       "trnar-ccu_40"       "trnak-uuu_17"      
##  [682] "trnak-cuu_59"       "trnav-aac_60"       "trnag-ucc_11"      
##  [685] "trnah-gug_46"       "trnav-cac_55"       "trnae-cuc_48"      
##  [688] "trnar-ccu_41"       "trnak-uuu_18"       "trnag-gcc_46"      
##  [691] "trnak-cuu_60"       "trnav-aac_61"       "trnar-ccu_42"      
##  [694] "trnak-uuu_19"       "trnag-ucc_12"       "trnak-cuu_61"      
##  [697] "trnav-cac_56"       "trnar-ccu_43"       "trnak-uuu_20"      
##  [700] "trnag-ucc_13"       "trnah-gug_47"       "trnak-cuu_62"      
##  [703] "trnav-cac_57"       "trnae-cuc_49"       "trnar-ccu_44"      
##  [706] "trnar-ccu_45"       "trnav-aac_62"       "trnah-gug_48"      
##  [709] "trnag-gcc_47"       "trnag-ucc_14"       "trnag-ucc_15"      
##  [712] "trnag-gcc_48"       "trnav-aac_63"       "trnae-cuc_50"      
##  [715] "trnak-uuu_21"       "trnah-gug_49"       "trnag-gcc_49"      
##  [718] "trnag-gcc_50"       "trnae-cuc_51"       "trnah-gug_50"      
##  [721] "trnar-ccu_46"       "trnar-ccu_47"       "trnar-ccu_48"      
##  [724] "trnal-uag_1"        "trnap-agg_13"       "trnan-guu_9"       
##  [727] "trnak-cuu_63"       "trnal-uag_2"        "trnap-agg_14"      
##  [730] "trnak-cuu_64"       "trnal-aag_1"        "trnap-agg_15"      
##  [733] "trnak-cuu_65"       "trnal-aag_2"        "trnak-cuu_66"      
##  [736] "trnal-aag_3"        "trnap-ugg_6"        "trnaf-gaa_2"       
##  [739] "trnam-cau_13"       "trnai-aau_8"        "trnai-aau_9"       
##  [742] "trnai-aau_10"       "trnai-aau_11"       "trnai-aau_12"      
##  [745] "trnai-aau_13"       "trnai-aau_14"       "trnas-gcu_1"       
##  [748] "trnai-aau_15"       "trnai-aau_16"       "trnal-aag_4"       
##  [751] "trnap-cgg_2"        "trnaw-cca_16"       "trnas-gcu_2"       
##  [754] "trnal-aag_5"        "trnap-ugg_7"        "trnal-aag_6"       
##  [757] "trnap-agg_16"       "trnal-uag_3"        "trnag-gcc_51"      
##  [760] "trnag-gcc_52"       "trnag-gcc_53"       "trnag-gcc_54"      
##  [763] "trnag-gcc_55"       "trnag-gcc_56"       "trnag-gcc_57"      
##  [766] "trnal-aag_7"        "trnal-aag_8"        "trnal-uag_4"       
##  [769] "trnas-gcu_3"        "trnai-aau_17"       "trnas-cga_2"       
##  [772] "trnas-gcu_4"        "trnap-ugg_8"        "trnap-agg_17"      
##  [775] "trnat-agu_1"        "trnap-ugg_9"        "trnat-agu_2"       
##  [778] "trnap-ugg_10"       "trnas-gcu_5"        "trnat-cgu_1"       
##  [781] "trnat-cgu_2"        "trnat-agu_3"        "trnas-uga_6"       
##  [784] "trnap-ugg_11"       "trnas-gcu_6"        "trnat-cgu_3"       
##  [787] "trnas-uga_7"        "trnap-ugg_12"       "trnas-gcu_7"       
##  [790] "trnat-cgu_4"        "trnas-uga_8"        "trnas-gcu_8"       
##  [793] "trnat-cgu_5"        "trnat-agu_4"        "trnas-uga_9"       
##  [796] "trnas-gcu_9"        "trnat-cgu_6"        "trnat-agu_5"       
##  [799] "trnas-uga_10"       "trnat-cgu_7"        "trnat-agu_6"       
##  [802] "trnat-agu_7"        "trnas-uga_11"       "trnat-agu_8"       
##  [805] "trnas-cga_3"        "trnat-cgu_8"        "trnat-agu_9"       
##  [808] "trnat-agu_10"       "trnas-cga_4"        "trnat-agu_11"      
##  [811] "trnas-cga_5"        "trnat-cgu_9"        "trnat-agu_12"      
##  [814] "trnas-cga_6"        "trnat-agu_13"       "trnal-aag_9"       
##  [817] "trnak-cuu_67"       "trnat-agu_14"       "trnak-cuu_68"      
##  [820] "trnat-ugu_3"        "trnat-ugu_4"        "trnak-cuu_69"      
##  [823] "trnak-cuu_70"       "trnat-ugu_5"        "trnak-cuu_71"      
##  [826] "trnak-cuu_72"       "trnat-agu_15"       "trnak-cuu_73"      
##  [829] "trnat-agu_16"       "trnak-cuu_74"       "trnat-agu_17"      
##  [832] "trnak-cuu_75"       "trnak-cuu_76"       "trnat-agu_18"      
##  [835] "trnak-cuu_77"       "trnak-cuu_78"       "trnat-agu_19"      
##  [838] "trnak-cuu_79"       "trnas-gcu_10"       "trnat-cgu_10"      
##  [841] "trnas-gcu_11"       "trnat-cgu_11"       "trnat-agu_20"      
##  [844] "trnat-ugu_6"        "trnat-cgu_12"       "trnas-gcu_12"      
##  [847] "trnat-agu_21"       "trnat-cgu_13"       "trnas-gcu_13"      
##  [850] "trnas-gcu_14"       "trnat-cgu_14"       "trnat-agu_22"      
##  [853] "trnas-gcu_15"       "trnat-agu_23"       "trnat-agu_24"      
##  [856] "trnat-ugu_7"        "trnag-gcc_58"       "trnat-agu_25"      
##  [859] "trnat-ugu_8"        "trnas-gcu_16"       "trnat-agu_26"      
##  [862] "trnat-agu_27"       "trnat-ugu_9"        "trnas-gcu_17"      
##  [865] "trnat-agu_28"       "trnat-cgu_15"       "trnas-gcu_18"      
##  [868] "trnat-agu_29"       "trnat-cgu_16"       "trnat-ugu_10"      
##  [871] "trnas-gcu_19"       "trnat-agu_30"       "trnat-cgu_17"      
##  [874] "trnas-gcu_20"       "trnas-gcu_21"       "trnat-cgu_18"      
##  [877] "trnat-cgu_19"       "trnas-gcu_22"       "trnat-cgu_20"      
##  [880] "trnat-agu_31"       "trnat-cgu_21"       "trnat-agu_32"      
##  [883] "trnat-agu_33"       "trnat-cgu_22"       "trnas-gcu_23"      
##  [886] "trnat-ugu_11"       "trnas-gcu_24"       "trnat-cgu_23"      
##  [889] "trnas-gcu_25"       "trnat-cgu_24"       "trnas-gcu_26"      
##  [892] "trnat-cgu_25"       "trnas-gcu_27"       "trnat-agu_34"      
##  [895] "trnat-cgu_26"       "trnas-gcu_28"       "trnat-cgu_27"      
##  [898] "trnas-gcu_29"       "trnas-gcu_30"       "trnat-ugu_12"      
##  [901] "trnas-gcu_31"       "trnat-agu_35"       "trnat-ugu_13"      
##  [904] "trnas-cga_7"        "trnat-cgu_28"       "trnat-ugu_14"      
##  [907] "trnat-ugu_15"       "trnat-ugu_16"       "trnat-agu_36"      
##  [910] "trnat-ugu_17"       "trnat-cgu_29"       "trnat-ugu_18"      
##  [913] "trnat-agu_37"       "trnat-ugu_19"       "trnat-agu_38"      
##  [916] "trnat-ugu_20"       "trnat-agu_39"       "trnat-ugu_21"      
##  [919] "trnat-agu_40"       "trnat-ugu_22"       "trnat-agu_41"      
##  [922] "trnat-ugu_23"       "trnat-agu_42"       "trnat-ugu_24"      
##  [925] "trnat-agu_43"       "trnat-ugu_25"       "trnat-agu_44"      
##  [928] "trnat-ugu_26"       "trnat-agu_45"       "trnat-ugu_27"      
##  [931] "trnat-agu_46"       "trnat-ugu_28"       "trnat-agu_47"      
##  [934] "trnat-ugu_29"       "trnat-agu_48"       "trnat-ugu_30"      
##  [937] "trnat-agu_49"       "trnat-ugu_31"       "trnat-agu_50"      
##  [940] "trnat-ugu_32"       "trnat-agu_51"       "trnat-ugu_33"      
##  [943] "trnat-agu_52"       "trnat-ugu_34"       "trnat-ugu_35"      
##  [946] "trnat-ugu_36"       "trnat-agu_53"       "trnat-ugu_37"      
##  [949] "trnat-agu_54"       "trnat-cgu_30"       "trnat-ugu_38"      
##  [952] "trnas-uga_12"       "trnap-ugg_13"       "trnap-ugg_14"      
##  [955] "trnag-ccc_32"       "trnae-uuc_9"        "trnak-cuu_80"      
##  [958] "trnaq-cug_18"       "trnav-cac_58"       "trnar-acg_62"      
##  [961] "trnar-acg_63"       "trnai-aau_18"       "trnag-gcc_59"      
##  [964] "trnai-aau_19"       "trnad-guc_55"       "trnad-guc_56"      
##  [967] "trnai-aau_20"       "trnad-guc_57"       "trnad-guc_58"      
##  [970] "trnad-guc_59"       "trnad-guc_60"       "trnad-guc_61"      
##  [973] "trnad-guc_62"       "trnad-guc_63"       "trnai-aau_21"      
##  [976] "trnad-guc_64"       "trnad-guc_65"       "trnai-aau_22"      
##  [979] "trnad-guc_66"       "trnad-guc_67"       "trnai-aau_23"      
##  [982] "trnad-guc_68"       "trnad-guc_69"       "trnai-aau_24"      
##  [985] "trnad-guc_70"       "trnad-guc_71"       "trnai-aau_25"      
##  [988] "trnad-guc_72"       "trnad-guc_73"       "trnai-aau_26"      
##  [991] "trnad-guc_74"       "trnad-guc_75"       "trnai-aau_27"      
##  [994] "trnad-guc_76"       "trnad-guc_77"       "trnai-aau_28"      
##  [997] "trnad-guc_78"       "trnad-guc_79"       "trnai-aau_29"      
## [1000] "trnad-guc_80"       "trnai-aau_30"       "trnad-guc_81"      
## [1003] "trnai-aau_31"       "trnai-aau_32"       "trnad-guc_82"      
## [1006] "trnai-aau_33"       "trnad-guc_83"       "trnad-guc_84"      
## [1009] "trnai-aau_34"       "trnad-guc_85"       "trnad-guc_86"      
## [1012] "trnai-aau_35"       "trnad-guc_87"       "trnad-guc_88"      
## [1015] "trnai-aau_36"       "trnap-agg_18"       "trnat-agu_55"      
## [1018] "trnap-ugg_15"       "trnad-guc_89"       "trnat-ugu_39"      
## [1021] "trnag-ccc_33"       "trnag-ccc_34"       "trnag-ccc_35"      
## [1024] "trnaa-ugc_3"        "trnak-uuu_22"       "trnaf-gaa_3"       
## [1027] "trnay-gua_4"        "trnam-cau_14"       "trnan-guu_10"      
## [1030] "trnaa-ugc_4"        "trnal-cag_1"        "trnaf-gaa_4"       
## [1033] "trnam-cau_15"       "trnaa-ugc_5"        "trnal-cag_2"       
## [1036] "trnak-uuu_23"       "trnaf-gaa_5"        "trnay-gua_5"       
## [1039] "trnam-cau_16"       "trnam-cau_17"       "trnaa-ugc_6"       
## [1042] "trnak-uuu_24"       "trnaf-gaa_6"        "trnay-gua_6"       
## [1045] "trnam-cau_18"       "trnan-guu_11"       "trnaa-ugc_7"       
## [1048] "trnal-cag_3"        "trnak-uuu_25"       "trnam-cau_19"      
## [1051] "trnam-cau_20"       "trnan-guu_12"       "trnaa-ugc_8"       
## [1054] "trnal-cag_4"        "trnak-uuu_26"       "trnaf-gaa_7"       
## [1057] "trnay-gua_7"        "trnam-cau_21"       "trnam-cau_22"      
## [1060] "trnan-guu_13"       "trnaa-ugc_9"        "trnal-cag_5"       
## [1063] "trnak-uuu_27"       "trnaf-gaa_8"        "trnay-gua_8"       
## [1066] "trnam-cau_23"       "trnam-cau_24"       "trnan-guu_14"      
## [1069] "trnaa-ugc_10"       "trnal-cag_6"        "trnak-uuu_28"      
## [1072] "trnaf-gaa_9"        "trnay-gua_9"        "trnam-cau_25"      
## [1075] "trnam-cau_26"       "trnan-guu_15"       "trnaa-ugc_11"      
## [1078] "trnal-cag_7"        "trnak-uuu_29"       "trnaf-gaa_10"      
## [1081] "trnay-gua_10"       "trnam-cau_27"       "trnam-cau_28"      
## [1084] "trnan-guu_16"       "trnaa-ugc_12"       "trnal-cag_8"       
## [1087] "trnak-uuu_30"       "trnaf-gaa_11"       "trnay-gua_11"      
## [1090] "trnam-cau_29"       "trnam-cau_30"       "trnan-guu_17"      
## [1093] "trnaa-ugc_13"       "trnal-cag_9"        "trnak-uuu_31"      
## [1096] "trnaf-gaa_12"       "trnay-gua_12"       "trnam-cau_31"      
## [1099] "trnam-cau_32"       "trnan-guu_18"       "trnaa-ugc_14"      
## [1102] "trnal-cag_10"       "trnak-uuu_32"       "trnaf-gaa_13"      
## [1105] "trnay-gua_13"       "trnam-cau_33"       "trnam-cau_34"      
## [1108] "trnan-guu_19"       "trnaa-ugc_15"       "trnal-cag_11"      
## [1111] "trnak-uuu_33"       "trnaf-gaa_14"       "trnay-gua_14"      
## [1114] "trnam-cau_35"       "trnam-cau_36"       "trnan-guu_20"      
## [1117] "trnaa-ugc_16"       "trnal-cag_12"       "trnak-uuu_34"      
## [1120] "trnaf-gaa_15"       "trnay-gua_15"       "trnam-cau_37"      
## [1123] "trnam-cau_38"       "trnan-guu_21"       "trnaa-ugc_17"      
## [1126] "trnal-cag_13"       "trnak-uuu_35"       "trnaf-gaa_16"      
## [1129] "trnay-gua_16"       "trnam-cau_39"       "trnam-cau_40"      
## [1132] "trnan-guu_22"       "trnaa-ugc_18"       "trnal-cag_14"      
## [1135] "trnak-uuu_36"       "trnaf-gaa_17"       "trnay-gua_17"      
## [1138] "trnam-cau_41"       "trnam-cau_42"       "trnan-guu_23"      
## [1141] "trnaa-ugc_19"       "trnal-cag_15"       "trnaf-gaa_18"      
## [1144] "trnay-gua_18"       "trnam-cau_43"       "trnam-cau_44"      
## [1147] "trnan-guu_24"       "trnaa-ugc_20"       "trnal-cag_16"      
## [1150] "trnak-uuu_37"       "trnaf-gaa_19"       "trnay-gua_19"      
## [1153] "trnam-cau_45"       "trnam-cau_46"       "trnak-uuu_38"      
## [1156] "trnal-cag_17"       "trnak-uuu_39"       "trnaf-gaa_20"      
## [1159] "trnay-gua_20"       "trnam-cau_47"       "trnam-cau_48"      
## [1162] "trnan-guu_25"       "trnaa-ugc_21"       "trnal-cag_18"      
## [1165] "trnak-uuu_40"       "trnaf-gaa_21"       "trnay-gua_21"      
## [1168] "trnam-cau_49"       "trnam-cau_50"       "trnan-guu_26"      
## [1171] "trnaa-ugc_22"       "trnal-cag_19"       "trnak-uuu_41"      
## [1174] "trnaf-gaa_22"       "trnay-gua_22"       "trnam-cau_51"      
## [1177] "trnam-cau_52"       "trnan-guu_27"       "trnaa-ugc_23"      
## [1180] "trnal-cag_20"       "trnak-uuu_42"       "trnan-guu_28"      
## [1183] "trnaf-gaa_23"       "trnay-gua_23"       "trnan-guu_29"      
## [1186] "trnak-uuu_43"       "trnaf-gaa_24"       "trnay-gua_24"      
## [1189] "trnam-cau_53"       "trnal-cag_21"       "trnak-uuu_44"      
## [1192] "trnaf-gaa_25"       "trnam-cau_54"       "trnan-guu_30"      
## [1195] "trnaa-ugc_24"       "trnal-cag_22"       "trnak-uuu_45"      
## [1198] "trnaf-gaa_26"       "trnay-gua_25"       "trnas-aga_22"      
## [1201] "trnaq-uug_22"       "trnas-aga_23"       "trnaq-uug_23"      
## [1204] "trnas-aga_24"       "trnaq-uug_24"       "trnas-aga_25"      
## [1207] "trnaq-uug_25"       "trnas-aga_26"       "trnaq-uug_26"      
## [1210] "trnas-aga_27"       "trnaq-uug_27"       "trnas-aga_28"      
## [1213] "trnay-gua_26"       "trnak-uuu_46"       "trnam-cau_55"      
## [1216] "trnay-gua_27"       "trnag-ucc_16"       "trnae-uuc_10"      
## [1219] "trnaa-agc_2"        "trnaf-gaa_27"       "trnak-uuu_47"      
## [1222] "trnam-cau_56"       "trnay-gua_28"       "trnag-ucc_17"      
## [1225] "trnae-uuc_11"       "trnaa-agc_3"        "trnaf-gaa_28"      
## [1228] "trnak-uuu_48"       "trnam-cau_57"       "trnay-gua_29"      
## [1231] "trnag-ucc_18"       "trnae-uuc_12"       "trnaa-agc_4"       
## [1234] "trnaf-gaa_29"       "trnak-uuu_49"       "trnam-cau_58"      
## [1237] "trnay-gua_30"       "trnag-ucc_19"       "trnae-uuc_13"      
## [1240] "trnaa-agc_5"        "trnaf-gaa_30"       "trnak-uuu_50"      
## [1243] "trnam-cau_59"       "trnay-gua_31"       "trnag-ucc_20"      
## [1246] "trnae-uuc_14"       "trnaa-agc_6"        "trnaf-gaa_31"      
## [1249] "trnak-uuu_51"       "trnam-cau_60"       "trnay-gua_32"      
## [1252] "trnag-ucc_21"       "trnae-uuc_15"       "trnaa-agc_7"       
## [1255] "trnaf-gaa_32"       "trnak-uuu_52"       "trnam-cau_61"      
## [1258] "trnay-gua_33"       "trnag-ucc_22"       "trnae-uuc_16"      
## [1261] "trnaa-agc_8"        "trnaf-gaa_33"       "trnak-uuu_53"      
## [1264] "trnam-cau_62"       "trnay-gua_34"       "trnag-ucc_23"      
## [1267] "trnae-uuc_17"       "trnaa-agc_9"        "trnaf-gaa_34"      
## [1270] "trnak-uuu_54"       "trnam-cau_63"       "trnay-gua_35"      
## [1273] "trnag-ucc_24"       "trnae-uuc_18"       "trnaa-agc_10"      
## [1276] "trnaf-gaa_35"       "trnak-uuu_55"       "trnam-cau_64"      
## [1279] "trnag-ucc_25"       "trnak-uuu_56"       "trnam-cau_65"      
## [1282] "trnak-uuu_57"       "trnaf-gaa_36"       "trnae-uuc_19"      
## [1285] "trnam-cau_66"       "trnak-uuu_58"       "trnaf-gaa_37"      
## [1288] "trnaa-agc_11"       "trnae-uuc_20"       "trnag-ucc_26"      
## [1291] "trnay-gua_36"       "trnam-cau_67"       "trnak-uuu_59"      
## [1294] "trnaf-gaa_38"       "trnaa-agc_12"       "trnae-uuc_21"      
## [1297] "trnag-ucc_27"       "trnay-gua_37"       "trnam-cau_68"      
## [1300] "trnak-uuu_60"       "trnaf-gaa_39"       "trnaa-agc_13"      
## [1303] "trnae-uuc_22"       "trnag-ucc_28"       "trnay-gua_38"      
## [1306] "trnam-cau_69"       "trnak-uuu_61"       "trnaf-gaa_40"      
## [1309] "trnaa-agc_14"       "trnae-uuc_23"       "trnag-ucc_29"      
## [1312] "trnay-gua_39"       "trnam-cau_70"       "trnak-uuu_62"      
## [1315] "trnaf-gaa_41"       "trnaa-agc_15"       "trnae-uuc_24"      
## [1318] "trnag-ucc_30"       "trnay-gua_40"       "trnam-cau_71"      
## [1321] "trnak-uuu_63"       "trnaf-gaa_42"       "trnaa-agc_16"      
## [1324] "trnae-uuc_25"       "trnag-ucc_31"       "trnay-gua_41"      
## [1327] "trnam-cau_72"       "trnak-uuu_64"       "trnaq-uug_28"      
## [1330] "trnae-cuc_52"       "trnag-ucc_32"       "trnag-ucc_33"      
## [1333] "trnag-ucc_34"       "trnag-ucc_35"       "trnam-cau_73"      
## [1336] "trnak-cuu_81"       "trnam-cau_74"       "trnak-uuu_65"      
## [1339] "trnam-cau_75"       "trnak-uuu_66"       "trnak-uuu_67"      
## [1342] "trnai-uau_1"        "trnai-uau_2"        "trnak-cuu_82"      
## [1345] "trnav-cac_59"       "trnan-guu_31"       "trnae-cuc_53"      
## [1348] "trnam-cau_76"       "trnan-guu_32"       "trnae-cuc_54"      
## [1351] "trnav-cac_60"       "trnam-cau_77"       "trnai-uau_3"       
## [1354] "trnak-uuu_68"       "trnah-gug_51"       "trnah-gug_52"      
## [1357] "trnak-uuu_69"       "trnar-acg_64"       "trnar-acg_65"      
## [1360] "trnar-acg_66"       "trnar-acg_67"       "trnar-acg_68"      
## [1363] "trnar-acg_69"       "trnar-acg_70"       "trnar-acg_71"      
## [1366] "trnar-acg_72"       "trnar-acg_73"       "trnar-acg_74"      
## [1369] "trnak-uuu_70"       "trnar-ccu_49"       "trnak-cuu_83"      
## [1372] "trnae-cuc_55"       "trnah-gug_53"       "trnak-cuu_84"      
## [1375] "trnae-cuc_56"       "trnah-gug_54"       "trnak-cuu_85"      
## [1378] "trnae-cuc_57"       "trnah-gug_55"       "trnak-cuu_86"      
## [1381] "trnae-cuc_58"       "trnah-gug_56"       "trnak-cuu_87"      
## [1384] "trnar-ccu_50"       "trnae-cuc_59"       "trnak-cuu_88"      
## [1387] "trnai-uau_4"        "trnap-ugg_16"       "trnap-agg_19"      
## [1390] "trnai-uau_5"        "trnap-ugg_17"       "trnai-uau_6"       
## [1393] "trnap-ugg_18"       "trnap-agg_20"       "trnai-uau_7"       
## [1396] "trnap-ugg_19"       "trnap-agg_21"       "trnai-uau_8"       
## [1399] "trnap-ugg_20"       "trnap-agg_22"       "trnai-uau_9"       
## [1402] "trnap-ugg_21"       "trnai-uau_10"       "trnap-ugg_22"      
## [1405] "trnap-agg_23"       "trnai-uau_11"       "trnap-ugg_23"      
## [1408] "trnap-agg_24"       "trnai-uau_12"       "trnap-ugg_24"      
## [1411] "trnap-agg_25"       "trnai-uau_13"       "trnap-ugg_25"      
## [1414] "trnap-agg_26"       "trnai-uau_14"       "trnap-ugg_26"      
## [1417] "trnap-agg_27"       "trnai-uau_15"       "trnap-ugg_27"      
## [1420] "trnap-agg_28"       "trnai-uau_16"       "trnap-ugg_28"      
## [1423] "trnap-agg_29"       "trnai-uau_17"       "trnap-ugg_29"      
## [1426] "trnap-agg_30"       "trnai-uau_18"       "trnap-ugg_30"      
## [1429] "trnap-agg_31"       "trnai-uau_19"       "trnap-ugg_31"      
## [1432] "trnap-agg_32"       "trnai-uau_20"       "trnap-ugg_32"      
## [1435] "trnap-agg_33"       "trnai-uau_21"       "trnap-ugg_33"      
## [1438] "trnap-agg_34"       "trnai-uau_22"       "trnap-ugg_34"      
## [1441] "trnap-agg_35"       "trnai-uau_23"       "trnap-ugg_35"      
## [1444] "trnap-agg_36"       "trnai-uau_24"       "trnai-uau_25"      
## [1447] "trnap-ugg_36"       "trnai-uau_26"       "trnai-uau_27"      
## [1450] "trnap-agg_37"       "trnai-uau_28"       "trnap-ugg_37"      
## [1453] "trnay-gua_42"       "trnae-uuc_26"       "trnay-gua_43"      
## [1456] "trnam-cau_78"       "trnae-uuc_27"       "trnaq-uug_29"      
## [1459] "trnay-gua_44"       "trnag-ucc_36"       "trnaq-uug_30"      
## [1462] "trnaq-uug_31"       "trnaq-uug_32"       "trnaq-uug_33"      
## [1465] "trnah-gug_57"       "trnar-ccu_51"       "trnai-aau_37"      
## [1468] "trnai-aau_38"       "trnai-aau_39"       "trnai-aau_40"      
## [1471] "trnai-aau_41"       "trnas-gcu_32"       "trnai-aau_42"      
## [1474] "trnai-aau_43"       "trnai-aau_44"       "trnas-gcu_33"      
## [1477] "trnai-aau_45"       "trnai-aau_46"       "trnas-gcu_34"      
## [1480] "trnai-aau_47"       "trnai-aau_48"       "trnai-aau_49"      
## [1483] "trnai-aau_50"       "trnal-aag_10"       "trnap-cgg_3"       
## [1486] "trnaw-cca_17"       "trnas-gcu_35"       "trnal-aag_11"      
## [1489] "trnap-ugg_38"       "trnal-aag_12"       "trnal-aag_13"      
## [1492] "trnal-aag_14"       "trnap-agg_38"       "trnal-uag_5"       
## [1495] "trnag-gcc_60"       "trnar-ucu_1"        "trnag-gcc_61"      
## [1498] "trnag-gcc_62"       "trnag-gcc_63"       "trnag-gcc_64"      
## [1501] "trnag-gcc_65"       "trnag-gcc_66"       "trnag-gcc_67"      
## [1504] "trnal-aag_15"       "trnal-uag_6"        "trnal-aag_16"      
## [1507] "trnal-uag_7"        "trnal-uag_8"        "trnal-aag_17"      
## [1510] "trnal-uag_9"        "trnal-aag_18"       "trnas-gcu_36"      
## [1513] "trnat-cgu_31"       "trnas-gcu_37"       "trnat-agu_56"      
## [1516] "trnat-agu_57"       "trnas-gcu_38"       "trnat-agu_58"      
## [1519] "trnas-gcu_39"       "trnat-agu_59"       "trnat-agu_60"      
## [1522] "trnas-gcu_40"       "trnat-agu_61"       "trnat-agu_62"      
## [1525] "trnas-gcu_41"       "trnat-agu_63"       "trnat-agu_64"      
## [1528] "trnas-gcu_42"       "trnat-cgu_32"       "trnat-agu_65"      
## [1531] "trnat-agu_66"       "trnas-gcu_43"       "trnat-cgu_33"      
## [1534] "trnas-cga_8"        "trnas-gcu_44"       "trnas-cga_9"       
## [1537] "trnas-gcu_45"       "trnat-cgu_34"       "trnas-gcu_46"      
## [1540] "trnat-cgu_35"       "trnas-cga_10"       "trnas-gcu_47"      
## [1543] "trnas-cga_11"       "trnas-cga_12"       "trnas-cga_13"      
## [1546] "trnas-cga_14"       "trnat-agu_67"       "trnas-cga_15"      
## [1549] "trnas-cga_16"       "trnas-aga_29"       "trnas-uga_13"      
## [1552] "trnat-agu_68"       "trnas-cga_17"       "trnas-cga_18"      
## [1555] "trnas-cga_19"       "trnas-cga_20"       "trnat-agu_69"      
## [1558] "trnas-cga_21"       "trnas-cga_22"       "trnat-agu_70"      
## [1561] "trnas-cga_23"       "trnas-cga_24"       "trnas-uga_14"      
## [1564] "trnas-uga_15"       "trnas-cga_25"       "trnas-cga_26"      
## [1567] "trnat-agu_71"       "trnas-cga_27"       "trnas-cga_28"      
## [1570] "trnat-agu_72"       "trnas-uga_16"       "trnas-cga_29"      
## [1573] "trnat-agu_73"       "trnas-cga_30"       "trnat-agu_74"      
## [1576] "trnas-cga_31"       "trnas-cga_32"       "trnas-uga_17"      
## [1579] "trnat-agu_75"       "trnas-cga_33"       "trnas-cga_34"      
## [1582] "trnas-cga_35"       "trnas-cga_36"       "trnat-agu_76"      
## [1585] "trnas-cga_37"       "trnat-agu_77"       "trnas-aga_30"      
## [1588] "trnas-cga_38"       "trnas-cga_39"       "trnas-cga_40"      
## [1591] "trnas-cga_41"       "trnat-agu_78"       "trnas-cga_42"      
## [1594] "trnas-cga_43"       "trnat-agu_79"       "trnas-cga_44"      
## [1597] "trnas-cga_45"       "trnas-cga_46"       "trnas-cga_47"      
## [1600] "trnas-uga_18"       "trnas-cga_48"       "trnas-cga_49"      
## [1603] "trnat-agu_80"       "trnas-cga_50"       "trnas-cga_51"      
## [1606] "trnal-aag_19"       "trnat-ugu_40"       "trnat-cgu_36"      
## [1609] "trnas-cga_52"       "trnat-cgu_37"       "trnas-uga_19"      
## [1612] "trnav-cac_61"       "trnaa-ugc_25"       "trnaa-ugc_26"      
## [1615] "trnaa-cgc_1"        "trnaa-cgc_2"        "trnaa-ugc_27"      
## [1618] "trnaa-agc_17"       "trnat-ugu_41"       "trnat-ugu_42"      
## [1621] "trnat-ugu_43"       "trnat-ugu_44"       "trnat-ugu_45"      
## [1624] "trnat-ugu_46"       "trnat-ugu_47"       "trnat-ugu_48"      
## [1627] "trnat-ugu_49"       "trnat-ugu_50"       "trnat-ugu_51"      
## [1630] "trnat-ugu_52"       "trnat-ugu_53"       "trnai-uau_29"      
## [1633] "trnat-ugu_54"       "trnat-ugu_55"       "trnav-uac_2"       
## [1636] "trnav-uac_3"        "trnav-uac_4"        "trnaw-cca_18"      
## [1639] "trnaw-cca_19"       "trnaw-cca_20"       "trnaw-cca_21"      
## [1642] "trnaw-cca_22"       "trnav-cac_62"       "trnat-ugu_56"      
## [1645] "trnas-uga_20"       "trnat-agu_81"       "trnaa-cgc_3"       
## [1648] "trnaa-ugc_28"       "trnat-agu_82"       "trnat-ugu_57"      
## [1651] "trnaw-cca_23"       "trnaa-ugc_29"       "trnaw-cca_24"      
## [1654] "trnat-ugu_58"       "trnaw-cca_25"       "trnaa-ugc_30"      
## [1657] "trnaw-cca_26"       "trnaa-ugc_31"       "trnat-ugu_59"      
## [1660] "trnaw-cca_27"       "trnaw-cca_28"       "trnaa-ugc_32"      
## [1663] "trnam-cau_79"       "trnaa-ugc_33"       "trnaw-cca_29"      
## [1666] "trnaq-uug_34"       "trnaw-cca_30"       "trnaq-uug_35"      
## [1669] "trnaa-ugc_34"       "trnaw-cca_31"       "trnaw-cca_32"      
## [1672] "trnaq-uug_36"       "trnat-ugu_60"       "trnaq-uug_37"      
## [1675] "trnaa-cgc_4"        "trnaw-cca_33"       "trnat-ugu_61"      
## [1678] "trnaw-cca_34"       "trnat-ugu_62"       "trnaw-cca_35"      
## [1681] "trnat-ugu_63"       "trnaa-ugc_35"       "trnaw-cca_36"      
## [1684] "trnaw-cca_37"       "trnat-ugu_64"       "trnaa-ugc_36"      
## [1687] "trnat-agu_83"       "trnaw-cca_38"       "trnaq-uug_38"      
## [1690] "trnat-agu_84"       "trnaw-cca_39"       "trnaq-uug_39"      
## [1693] "trnaa-ugc_37"       "trnat-agu_85"       "trnaw-cca_40"      
## [1696] "trnaa-ugc_38"       "trnat-agu_86"       "trnai-aau_51"      
## [1699] "trnaq-cug_19"       "trnaq-uug_40"       "trnai-aau_52"      
## [1702] "trnaq-uug_41"       "trnaq-uug_42"       "trnai-aau_53"      
## [1705] "trnaq-uug_43"       "trnaq-uug_44"       "trnai-aau_54"      
## [1708] "trnaq-uug_45"       "trnaq-uug_46"       "trnai-aau_55"      
## [1711] "trnaq-cug_20"       "trnaq-uug_47"       "trnai-aau_56"      
## [1714] "trnaq-cug_21"       "trnaq-uug_48"       "trnai-aau_57"      
## [1717] "trnaq-cug_22"       "trnaq-uug_49"       "trnai-aau_58"      
## [1720] "trnaq-cug_23"       "trnaq-uug_50"       "trnai-aau_59"      
## [1723] "trnaq-uug_51"       "trnaq-uug_52"       "trnai-aau_60"      
## [1726] "trnaq-cug_24"       "trnaq-uug_53"       "trnai-aau_61"      
## [1729] "trnaq-cug_25"       "trnaq-uug_54"       "trnai-aau_62"      
## [1732] "trnaq-uug_55"       "trnai-aau_63"       "trnaq-cug_26"      
## [1735] "trnaq-uug_56"       "trnai-aau_64"       "trnaq-uug_57"      
## [1738] "trnaq-uug_58"       "trnai-aau_65"       "trnaq-cug_27"      
## [1741] "trnaq-uug_59"       "trnai-aau_66"       "trnaq-cug_28"      
## [1744] "trnaq-uug_60"       "trnaq-uug_61"       "trnai-aau_67"      
## [1747] "trnaq-uug_62"       "trnai-aau_68"       "trnaa-ugc_39"      
## [1750] "trnaa-ugc_40"       "trnaa-cgc_5"        "trnaa-ugc_41"      
## [1753] "trnaa-ugc_42"       "trnaa-cgc_6"        "trnaa-ugc_43"      
## [1756] "trnaa-cgc_7"        "trnaa-ugc_44"       "trnaa-cgc_8"       
## [1759] "trnaa-ugc_45"       "trnaa-ugc_46"       "trnaa-cgc_9"       
## [1762] "trnaa-ugc_47"       "trnaa-cgc_10"       "trnaa-ugc_48"      
## [1765] "trnaa-cgc_11"       "trnaa-ugc_49"       "trnaa-cgc_12"      
## [1768] "trnaa-ugc_50"       "trnaa-cgc_13"       "trnaa-ugc_51"      
## [1771] "trnaa-ugc_52"       "trnaa-ugc_53"       "trnaa-cgc_14"      
## [1774] "trnaa-ugc_54"       "trnaa-ugc_55"       "trnaa-cgc_15"      
## [1777] "trnaa-ugc_56"       "trnaa-cgc_16"       "trnaa-ugc_57"      
## [1780] "trnaa-ugc_58"       "trnal-cag_23"       "trnal-caa_1"       
## [1783] "trnaq-cug_29"       "trnal-caa_2"        "trnad-guc_90"      
## [1786] "trnad-guc_91"       "trnai-aau_69"       "trnaa-cgc_17"      
## [1789] "trnaa-cgc_18"       "trnaa-cgc_19"       "trnaa-cgc_20"      
## [1792] "trnaa-cgc_21"       "trnaa-cgc_22"       "trnaa-cgc_23"      
## [1795] "trnam-cau_80"       "trnat-agu_87"       "trnag-gcc_68"      
## [1798] "trnag-gcc_69"       "trnag-gcc_70"       "trnag-gcc_71"      
## [1801] "trnag-gcc_72"       "trnag-gcc_73"       "trnar-ucu_2"       
## [1804] "trnal-uaa_15"       "trnar-ccg_1"        "trnaq-cug_30"      
## [1807] "trnar-ccg_2"        "trnan-guu_33"       "trnaf-gaa_43"      
## [1810] "trnat-ugu_65"       "trnat-ugu_66"       "trnat-ugu_67"      
## [1813] "trnat-ugu_68"       "trnat-ugu_69"       "trnat-ugu_70"      
## [1816] "trnat-cgu_38"       "trnai-aau_70"       "trnai-aau_71"      
## [1819] "trnai-aau_72"       "trnad-guc_92"       "trnad-guc_93"      
## [1822] "trnal-caa_3"        "trnaq-uug_63"       "trnal-caa_4"       
## [1825] "trnal-caa_5"        "trnal-caa_6"        "trnaq-cug_31"      
## [1828] "trnal-caa_7"        "trnal-cag_24"       "trnaa-ugc_59"      
## [1831] "trnal-caa_8"        "trnaa-ugc_60"       "trnaa-cgc_24"      
## [1834] "trnal-caa_9"        "trnal-caa_10"       "trnaa-cgc_25"      
## [1837] "trnal-caa_11"       "trnaa-cgc_26"       "trnai-aau_73"      
## [1840] "trnal-caa_12"       "trnaa-ugc_61"       "trnal-caa_13"      
## [1843] "trnat-agu_88"       "trnaw-cca_41"       "trnaa-ugc_62"      
## [1846] "trnat-cgu_39"       "trnaw-cca_42"       "trnat-ugu_71"      
## [1849] "trnat-agu_89"       "trnaa-ugc_63"       "trnaa-cgc_27"      
## [1852] "trnat-agu_90"       "trnas-uga_21"       "trnav-cac_63"      
## [1855] "trnaw-cca_43"       "trnaw-cca_44"       "trnaw-cca_45"      
## [1858] "trnaw-cca_46"       "trnaw-cca_47"       "trnaw-cca_48"      
## [1861] "trnav-uac_5"        "trnav-uac_6"        "trnav-uac_7"       
## [1864] "trnav-uac_8"        "trnav-uac_9"        "trnav-uac_10"      
## [1867] "trnat-ugu_72"       "trnat-ugu_73"       "trnat-ugu_74"      
## [1870] "trnat-ugu_75"       "trnat-ugu_76"       "trnat-ugu_77"      
## [1873] "trnaa-agc_18"       "trnaa-ugc_64"       "trnac-gca_2"       
## [1876] "trnam-cau_81"       "trnat-agu_91"       "trnag-gcc_74"      
## [1879] "trnag-gcc_75"       "trnag-gcc_76"       "trnar-ucu_3"       
## [1882] "trnar-ccg_3"        "trnar-ccg_4"        "trnar-ccg_5"       
## [1885] "trnan-guu_34"       "trnaf-gaa_44"       "trnat-ugu_78"      
## [1888] "trnat-ugu_79"       "trnac-gca_3"        "trnac-gca_4"       
## [1891] "trnac-gca_5"        "trnac-gca_6"        "trnac-gca_7"       
## [1894] "trnai-uau_30"       "trnas-gcu_48"       "trnal-uaa_16"      
## [1897] "trnal-uaa_17"       "trnal-uaa_18"       "trnal-uaa_19"      
## [1900] "trnal-uaa_20"       "trnal-uaa_21"       "trnal-uaa_22"      
## [1903] "trnal-uaa_23"       "trnal-uaa_24"       "trnal-uaa_25"      
## [1906] "trnal-uaa_26"       "trnal-uaa_27"       "trnal-uaa_28"      
## [1909] "trnal-uaa_29"       "trnal-uaa_30"       "trnal-uaa_31"      
## [1912] "trnal-uaa_32"       "trnal-uaa_33"       "trnal-uaa_34"      
## [1915] "trnal-uaa_35"       "trnal-uaa_36"       "trnal-uaa_37"      
## [1918] "trnal-uaa_38"       "trnal-uaa_39"       "trnal-uaa_40"      
## [1921] "trnal-uaa_41"       "trnal-uaa_42"       "trnal-uaa_43"      
## [1924] "trnal-uaa_44"       "trnal-uaa_45"       "trnal-uaa_46"      
## [1927] "trnal-uaa_47"       "trnal-uaa_48"       "trnal-uaa_49"      
## [1930] "trnal-uaa_50"       "trnal-uaa_51"       "trnal-uaa_52"      
## [1933] "trnal-uaa_53"       "trnal-uaa_54"       "trnal-uaa_55"      
## [1936] "trnal-uaa_56"       "trnal-uaa_57"       "trnal-uaa_58"      
## [1939] "trnal-uaa_59"       "trnal-uaa_60"       "trnal-uaa_61"      
## [1942] "trnal-uaa_62"       "trnae-cuc_60"       "trnac-gca_8"       
## [1945] "trnas-aga_31"       "trnal-cag_25"       "trnae-uuc_28"      
## [1948] "trnaa-agc_19"       "trnay-gua_45"       "trnaa-agc_20"      
## [1951] "trnay-gua_46"       "trnaa-agc_21"       "trnae-uuc_29"      
## [1954] "trnae-uuc_30"       "trnay-gua_47"       "trnaa-agc_22"      
## [1957] "trnaa-agc_23"       "trnaa-agc_24"       "trnay-gua_48"      
## [1960] "trnae-uuc_31"       "trnaa-agc_25"       "trnaa-agc_26"      
## [1963] "trnae-uuc_32"       "trnaa-agc_27"       "trnay-gua_49"      
## [1966] "trnae-uuc_33"       "trnaa-agc_28"       "trnay-gua_50"      
## [1969] "trnay-gua_51"       "trnae-uuc_34"       "trnae-uuc_35"      
## [1972] "trnay-gua_52"       "trnae-uuc_36"       "trnae-uuc_37"      
## [1975] "trnas-gcu_49"       "trnai-uau_31"       "trnac-gca_9"       
## [1978] "trnac-gca_10"       "trnac-gca_11"       "trnac-gca_12"      
## [1981] "trnac-gca_13"       "trnac-gca_14"       "trnac-gca_15"      
## [1984] "trnal-uaa_63"       "trnal-uaa_64"       "trnal-uaa_65"      
## [1987] "trnal-uaa_66"       "trnal-uaa_67"       "trnal-uaa_68"      
## [1990] "trnal-uaa_69"       "trnal-uaa_70"       "trnaa-cgc_28"      
## [1993] "trnas-aga_32"       "trnay-gua_53"       "trnay-gua_54"      
## [1996] "trnae-uuc_38"       "trnay-gua_55"       "trnay-gua_56"      
## [1999] "trnay-gua_57"       "trnae-uuc_39"       "trnay-gua_58"      
## [2002] "trnaa-agc_29"       "trnay-gua_59"       "trnay-gua_60"      
## [2005] "trnay-gua_61"       "trnaa-agc_30"       "trnaa-agc_31"      
## [2008] "trnaa-agc_32"       "trnaa-agc_33"       "trnaa-agc_34"      
## [2011] "trnay-gua_62"       "trnaa-agc_35"       "trnay-gua_63"      
## [2014] "trnaa-agc_36"       "trnaa-agc_37"       "trnay-gua_64"      
## [2017] "trnay-gua_65"       "trnay-gua_66"       "trnaa-agc_38"      
## [2020] "trnae-uuc_40"       "trnay-gua_67"       "trnay-gua_68"      
## [2023] "trnay-gua_69"       "trnae-uuc_41"       "trnae-uuc_42"      
## [2026] "trnay-gua_70"       "trnae-uuc_43"       "trnae-uuc_44"      
## [2029] "trnan-guu_35"       "trnat-agu_92"       "trnaa-agc_39"      
## [2032] "trnav-uac_11"       "trnav-uac_12"       "trnav-uac_13"      
## [2035] "trnag-ucc_37"       "trnav-uac_14"       "trnav-uac_15"      
## [2038] "trnan-guu_36"       "trnak-uuu_71"       "trnam-cau_82"      
## [2041] "trnaa-agc_40"       "trnav-uac_16"       "trnav-uac_17"      
## [2044] "trnav-uac_18"       "trnav-uac_19"       "trnav-uac_20"      
## [2047] "trnav-uac_21"       "trnav-uac_22"       "trnav-uac_23"      
## [2050] "trnav-uac_24"       "trnav-uac_25"       "trnav-uac_26"      
## [2053] "trnav-uac_27"       "trnav-uac_28"       "trnav-uac_29"      
## [2056] "trnav-uac_30"       "trnav-uac_31"       "trnav-uac_32"      
## [2059] "trnav-uac_33"       "trnav-uac_34"       "trnav-uac_35"      
## [2062] "trnav-uac_36"       "trnav-uac_37"       "trnav-uac_38"      
## [2065] "trnav-uac_39"       "trnav-uac_40"       "trnav-uac_41"      
## [2068] "trnav-uac_42"       "trnav-uac_43"       "trnav-uac_44"      
## [2071] "trnav-uac_45"       "trnav-uac_46"       "trnav-uac_47"      
## [2074] "trnaq-cug_32"       "trnaf-gaa_45"       "trnaf-gaa_46"      
## [2077] "trnal-cag_26"       "trnaf-gaa_47"       "trnal-cag_27"      
## [2080] "trnal-uag_10"       "trnaf-gaa_48"       "trnal-cag_28"      
## [2083] "trnar-ccu_52"       "trnal-cag_29"       "trnar-ccu_53"      
## [2086] "trnar-ccu_54"       "trnal-cag_30"       "trnal-cag_31"      
## [2089] "trnal-cag_32"       "trnal-cag_33"       "trnaf-gaa_49"      
## [2092] "trnal-cag_34"       "trnaf-gaa_50"       "trnal-cag_35"      
## [2095] "trnal-cag_36"       "trnaf-gaa_51"       "trnal-cag_37"      
## [2098] "trnaf-gaa_52"       "trnal-cag_38"       "trnal-uag_11"      
## [2101] "trnas-gcu_50"       "trnas-gcu_51"       "trnal-aag_20"      
## [2104] "trnal-uag_12"       "trnap-cgg_4"        "trnan-guu_37"      
## [2107] "trnal-aag_21"       "trnas-gcu_52"       "trnal-aag_22"      
## [2110] "trnal-uag_13"       "trnap-cgg_5"        "trnan-guu_38"      
## [2113] "trnas-gcu_53"       "trnal-aag_23"       "trnal-uag_14"      
## [2116] "trnap-cgg_6"        "trnan-guu_39"       "trnal-aag_24"      
## [2119] "trnal-uag_15"       "trnap-cgg_7"        "trnan-guu_40"      
## [2122] "trnas-gcu_54"       "trnal-aag_25"       "trnal-uag_16"      
## [2125] "trnap-cgg_8"        "trnan-guu_41"       "trnap-cgg_9"       
## [2128] "trnan-guu_42"       "trnal-aag_26"       "trnas-gcu_55"      
## [2131] "trnal-aag_27"       "trnap-cgg_10"       "trnan-guu_43"      
## [2134] "trnas-gcu_56"       "trnal-aag_28"       "trnal-uag_17"      
## [2137] "trnap-cgg_11"       "trnan-guu_44"       "trnal-aag_29"      
## [2140] "trnal-uag_18"       "trnal-uag_19"       "trnap-cgg_12"      
## [2143] "trnan-guu_45"       "trnas-gcu_57"       "trnal-aag_30"      
## [2146] "trnal-uag_20"       "trnap-cgg_13"       "trnan-guu_46"      
## [2149] "trnas-gcu_58"       "trnal-aag_31"       "trnal-uag_21"      
## [2152] "trnap-cgg_14"       "trnan-guu_47"       "trnas-gcu_59"      
## [2155] "trnal-aag_32"       "trnal-uag_22"       "trnap-cgg_15"      
## [2158] "trnan-guu_48"       "trnas-gcu_60"       "trnal-aag_33"      
## [2161] "trnal-uag_23"       "trnap-cgg_16"       "trnan-guu_49"      
## [2164] "trnas-gcu_61"       "trnal-aag_34"       "trnal-uag_24"      
## [2167] "trnap-cgg_17"       "trnan-guu_50"       "trnam-cau_83"      
## [2170] "trnas-gcu_62"       "trnal-aag_35"       "trnal-uag_25"      
## [2173] "trnap-cgg_18"       "trnan-guu_51"       "trnam-cau_84"      
## [2176] "trnas-gcu_63"       "trnal-aag_36"       "trnal-uag_26"      
## [2179] "trnap-cgg_19"       "trnan-guu_52"       "trnam-cau_85"      
## [2182] "trnas-gcu_64"       "trnal-aag_37"       "trnal-uag_27"      
## [2185] "trnap-cgg_20"       "trnan-guu_53"       "trnam-cau_86"      
## [2188] "trnas-gcu_65"       "trnas-gcu_66"       "trnal-aag_38"      
## [2191] "trnae-cuc_61"       "trnav-cac_64"       "trnar-acg_75"      
## [2194] "trnar-ccu_55"       "trnav-aac_64"       "trnae-cuc_62"      
## [2197] "trnav-cac_65"       "trnar-acg_76"       "trnar-ccu_56"      
## [2200] "trnav-aac_65"       "trnae-cuc_63"       "trnav-cac_66"      
## [2203] "trnar-acg_77"       "trnar-ccu_57"       "trnav-aac_66"      
## [2206] "trnae-cuc_64"       "trnav-cac_67"       "trnar-acg_78"      
## [2209] "trnas-gcu_67"       "trnas-gcu_68"       "trnas-gcu_69"      
## [2212] "trnam-cau_87"       "trnan-guu_54"       "trnap-cgg_21"      
## [2215] "trnal-uag_28"       "trnal-aag_39"       "trnas-gcu_70"      
## [2218] "trnam-cau_88"       "trnan-guu_55"       "trnap-cgg_22"      
## [2221] "trnal-uag_29"       "trnal-aag_40"       "trnas-gcu_71"      
## [2224] "trnam-cau_89"       "trnan-guu_56"       "trnap-cgg_23"      
## [2227] "trnal-uag_30"       "trnal-aag_41"       "trnas-gcu_72"      
## [2230] "trnan-guu_57"       "trnai-uau_32"       "trnaf-gaa_53"      
## [2233] "trnar-ccu_58"       "trnal-cag_39"       "trnaf-gaa_54"      
## [2236] "trnal-cag_40"       "trnaf-gaa_55"       "trnal-cag_41"      
## [2239] "trnaf-gaa_56"       "trnal-uag_31"       "trnaf-gaa_57"      
## [2242] "trnal-cag_42"       "trnar-ccu_59"       "trnal-cag_43"      
## [2245] "trnar-ccu_60"       "trnaf-gaa_58"       "trnal-cag_44"      
## [2248] "trnal-cag_45"       "trnal-cag_46"       "trnaf-gaa_59"      
## [2251] "trnar-ucu_4"        "trnar-ucu_5"        "trnar-ucu_6"       
## [2254] "trnar-ucu_7"        "trnar-ucu_8"        "trnak-uuu_72"      
## [2257] "trnar-ucu_9"        "trnar-ucu_10"       "trnar-ucu_11"      
## [2260] "trnak-uuu_73"       "trnar-ucu_12"       "trnar-ucu_13"      
## [2263] "trnar-ucu_14"       "trnar-ucu_15"       "trnaq-cug_33"      
## [2266] "trnaq-uug_64"       "trnaq-cug_34"       "trnas-aga_33"      
## [2269] "trnaq-uug_65"       "trnaq-cug_35"       "trnas-aga_34"      
## [2272] "trnal-caa_14"       "trnav-uac_48"       "trnar-ucg_1"       
## [2275] "trnar-ucg_2"        "trnal-caa_15"       "trnar-ucg_3"       
## [2278] "trnar-ucg_4"        "trnal-caa_16"       "trnar-ucg_5"       
## [2281] "trnar-ucg_6"        "trnal-caa_17"       "trnar-ucg_7"       
## [2284] "trnar-ucg_8"        "trnal-caa_18"       "trnar-ucg_9"       
## [2287] "trnar-ucg_10"       "trnal-caa_19"       "trnar-ucg_11"      
## [2290] "trnar-ucg_12"       "trnal-caa_20"       "trnar-ucg_13"      
## [2293] "trnar-ucg_14"       "trnal-caa_21"       "trnar-ucg_15"      
## [2296] "trnar-ucg_16"       "trnal-caa_22"       "trnar-ucg_17"      
## [2299] "trnar-ucg_18"       "trnal-caa_23"       "trnar-ucg_19"      
## [2302] "trnar-ucg_20"       "trnal-caa_24"       "trnar-ucg_21"      
## [2305] "trnar-ucg_22"       "trnaa-agc_41"       "trnae-cuc_65"      
## [2308] "trnag-ucc_38"       "trnan-guu_58"       "trnal-uaa_71"      
## [2311] "trnae-cuc_66"       "trnan-guu_59"       "trnar-ucg_23"      
## [2314] "trnai-uau_33"       "trnak-cuu_89"       "trnak-cuu_90"      
## [2317] "trnai-aau_74"       "trnas-gcu_73"       "trnar-ucu_16"      
## [2320] "trnag-ucc_39"       "trnae-uuc_45"       "trnai-aau_75"      
## [2323] "trnak-cuu_91"       "trnar-ucu_17"       "trnak-uuu_74"      
## [2326] "trnal-caa_25"       "trnav-uac_49"       "trnag-gcc_77"      
## [2329] "trnag-gcc_78"       "trnag-gcc_79"       "trnaa-agc_42"      
## [2332] "trnae-cuc_67"       "trnag-ucc_40"       "trnan-guu_60"      
## [2335] "trnal-uaa_72"       "trnae-cuc_68"       "trnae-cuc_69"      
## [2338] "trnan-guu_61"       "trnal-uaa_73"       "trnar-ucg_24"      
## [2341] "trnai-uau_34"       "trnak-uuu_75"       "trnai-uau_35"      
## [2344] "trnal-uaa_74"       "trnar-ucg_25"       "trnai-uau_36"      
## [2347] "trnal-uaa_75"       "trnae-cuc_70"       "trnaq-uug_66"      
## [2350] "trnai-uau_37"       "trnar-ucg_26"       "trnai-uau_38"      
## [2353] "trnai-uau_39"       "trnar-ucg_27"       "trnastop-uca_1"    
## [2356] "trnas-gcu_74"       "trnar-ucu_18"       "trnag-ucc_41"      
## [2359] "trnak-cuu_92"       "trnas-gcu_75"       "trnac-gca_16"      
## [2362] "trnac-gca_17"       "trnac-gca_18"       "trnac-gca_19"      
## [2365] "trnac-gca_20"       "trnac-gca_21"       "trnac-gca_22"      
## [2368] "trnac-gca_23"       "trnac-gca_24"       "trnac-gca_25"      
## [2371] "trnac-gca_26"       "trnac-gca_27"       "trnac-gca_28"      
## [2374] "trnac-gca_29"       "trnac-gca_30"       "trnac-gca_31"      
## [2377] "trnac-gca_32"       "trnaq-cug_36"       "trnaq-cug_37"      
## [2380] "trnas-aga_35"       "trnas-aga_36"       "trnaq-uug_67"      
## [2383] "trnar-ucu_19"       "trnas-aga_37"       "trnas-uga_22"      
## [2386] "trnaq-cug_38"       "trnas-aga_38"       "trnan-guu_62"      
## [2389] "trnar-ucu_20"       "trnas-uga_23"       "trnaq-cug_39"      
## [2392] "trnas-aga_39"       "trnan-guu_63"       "trnas-uga_24"      
## [2395] "trnaq-cug_40"       "trnas-aga_40"       "trnav-cac_68"      
## [2398] "trnaq-uug_68"       "trnaq-cug_41"       "trnas-aga_41"      
## [2401] "trnaq-uug_69"       "trnaq-cug_42"       "trnan-guu_64"      
## [2404] "trnaq-cug_43"       "trnas-uga_25"       "trnas-uga_26"      
## [2407] "trnas-aga_42"       "trnan-guu_65"       "trnav-aac_67"      
## [2410] "trnaq-cug_44"       "trnas-aga_43"       "trnaq-uug_70"      
## [2413] "trnan-guu_66"       "trnas-aga_44"       "trnac-gca_33"      
## [2416] "trnam-cau_90"       "trnar-ccu_61"       "trnar-ucg_28"      
## [2419] "trnap-agg_39"       "trnam-cau_91"       "trnaq-cug_45"      
## [2422] "trnaq-uug_71"       "trnas-aga_45"       "trnar-ccg_6"       
## [2425] "trnaa-cgc_29"       "trnag-gcc_80"       "trnat-cgu_40"      
## [2428] "trnal-uag_32"       "trnar-ucu_21"       "trnar-ucu_22"      
## [2431] "trnar-ucu_23"       "trnar-ucu_24"       "trnar-ucu_25"      
## [2434] "trnar-ucu_26"       "trnar-ucu_27"       "trnar-ucu_28"      
## [2437] "trnar-ucu_29"       "trnar-ucu_30"       "trnar-ucu_31"      
## [2440] "trnar-ucu_32"       "trnar-ucu_33"       "trnar-ucu_34"      
## [2443] "trnar-ucu_35"       "trnar-ucu_36"       "trnar-ucu_37"      
## [2446] "trnar-ucu_38"       "trnar-ucu_39"       "trnar-ucu_40"      
## [2449] "trnar-ucu_41"       "trnar-ucu_42"       "trnar-ucu_43"      
## [2452] "trnar-ucu_44"       "trnar-ucu_45"       "trnar-ucu_46"      
## [2455] "trnar-ucu_47"       "trnar-ucu_48"       "trnar-ucu_49"      
## [2458] "trnak-uuu_76"       "trnar-ucu_50"       "trnar-ucu_51"      
## [2461] "trnar-ucu_52"       "trnar-ucu_53"       "trnar-ucu_54"      
## [2464] "trnar-ucu_55"       "trnar-ucu_56"       "trnar-ucu_57"      
## [2467] "trnar-ucu_58"       "trnar-ucu_59"       "trnar-ucu_60"      
## [2470] "trnar-ucu_61"       "trnar-ucu_62"       "trnar-ucu_63"      
## [2473] "trnar-ucu_64"       "trnar-ucu_65"       "trnar-ucu_66"      
## [2476] "trnar-ucu_67"       "trnar-ucu_68"       "trnar-ucu_69"      
## [2479] "trnar-ucu_70"       "trnar-ucu_71"       "trnar-ucu_72"      
## [2482] "trnar-ucu_73"       "trnar-ucu_74"       "trnar-ucu_75"      
## [2485] "trnar-ucu_76"       "trnar-ucu_77"       "trnar-ucu_78"      
## [2488] "trnar-ucu_79"       "trnar-ucu_80"       "trnar-ucu_81"      
## [2491] "trnar-ucu_82"       "trnar-ucu_83"       "trnar-ucu_84"      
## [2494] "trnar-ucu_85"       "trnar-ucu_86"       "trnar-ucu_87"      
## [2497] "trnar-ucu_88"       "trnar-ucu_89"       "trnar-ucu_90"      
## [2500] "trnar-ucu_91"       "trnar-ucu_92"       "trnai-uau_40"      
## [2503] "trnak-uuu_77"       "trnag-ucc_42"       "trnac-gca_34"      
## [2506] "trnac-gca_35"       "trnac-gca_36"       "trnac-gca_37"      
## [2509] "trnac-gca_38"       "trnac-gca_39"       "trnac-gca_40"      
## [2512] "trnaq-cug_46"       "trnas-aga_46"       "trnas-aga_47"      
## [2515] "trnan-guu_67"       "trnas-aga_48"       "trnaq-uug_72"      
## [2518] "trnav-cac_69"       "trnaq-cug_47"       "trnav-aac_68"      
## [2521] "trnan-guu_68"       "trnas-aga_49"       "trnaq-uug_73"      
## [2524] "trnav-cac_70"       "trnaq-cug_48"       "trnav-aac_69"      
## [2527] "trnan-guu_69"       "trnas-aga_50"       "trnaq-uug_74"      
## [2530] "trnav-cac_71"       "trnaq-cug_49"       "trnar-ucu_93"      
## [2533] "trnan-guu_70"       "trnas-uga_27"       "trnas-uga_28"      
## [2536] "trnas-aga_51"       "trnan-guu_71"       "trnav-aac_70"      
## [2539] "trnaq-cug_50"       "trnav-cac_72"       "trnas-aga_52"      
## [2542] "trnaq-uug_75"       "trnan-guu_72"       "trnac-gca_41"      
## [2545] "trnac-gca_42"       "trnac-gca_43"       "trnam-cau_92"      
## [2548] "trnar-ccu_62"       "trnar-ucg_29"       "trnam-cau_93"      
## [2551] "trnaq-cug_51"       "trnaq-uug_76"       "trnas-aga_53"      
## [2554] "trnar-ccg_7"        "trnaa-cgc_30"       "trnag-gcc_81"      
## [2557] "trnat-cgu_41"       "trnal-uag_33"       "trnak-uuu_78"      
## [2560] "trnag-ucc_43"       "ccdc50.L_1"         "unassigned_gene_1" 
## [2563] "unassigned_gene_2"  "unassigned_gene_3"  "unassigned_gene_4" 
## [2566] "unassigned_gene_5"  "unassigned_gene_6"  "unassigned_gene_7" 
## [2569] "unassigned_gene_8"  "unassigned_gene_9"  "unassigned_gene_10"
## [2572] "unassigned_gene_11" "unassigned_gene_12" "unassigned_gene_13"
## [2575] "unassigned_gene_14" "unassigned_gene_15" "unassigned_gene_16"
## [2578] "unassigned_gene_17" "unassigned_gene_18" "unassigned_gene_19"
## [2581] "unassigned_gene_20" "unassigned_gene_21" "unassigned_gene_22"
## [2584] "unassigned_gene_23" "unassigned_gene_24"

That is a whole bunch of tRNAs. Please don’t stop reading where the list is cut off. Let’s check whether the tRNA annotations are the only issue here:

setdiff(
  grep("_", rownames(xenopus.data), value = TRUE),
  grep("^trna", rownames(xenopus.data), value = TRUE)
)
##  [1] "ccdc50.L_1"         "unassigned_gene_1"  "unassigned_gene_2" 
##  [4] "unassigned_gene_3"  "unassigned_gene_4"  "unassigned_gene_5" 
##  [7] "unassigned_gene_6"  "unassigned_gene_7"  "unassigned_gene_8" 
## [10] "unassigned_gene_9"  "unassigned_gene_10" "unassigned_gene_11"
## [13] "unassigned_gene_12" "unassigned_gene_13" "unassigned_gene_14"
## [16] "unassigned_gene_15" "unassigned_gene_16" "unassigned_gene_17"
## [19] "unassigned_gene_18" "unassigned_gene_19" "unassigned_gene_20"
## [22] "unassigned_gene_21" "unassigned_gene_22" "unassigned_gene_23"
## [25] "unassigned_gene_24"
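
As a complement to the `setdiff()` call above, the underscore-containing names can also be grouped into coarse categories so that anything unexpected stands out at a glance. This is only a minimal sketch: `feature.names` is a hypothetical toy vector that mimics the patterns seen above, not the real `rownames(xenopus.data)`.

```r
# Toy feature names (hypothetical, mimicking the patterns seen above)
feature.names <- c("trnaa-ugc_39", "trnal-caa_1", "ccdc50.L", "ccdc50.L_1",
                   "unassigned_gene_1", "unassigned_gene_2", "sox2.S")

# Keep only the names that contain an underscore
with.underscore <- grep("_", feature.names, value = TRUE)

# Assign each name to a coarse category based on its prefix
category <- ifelse(grepl("^trna", with.underscore), "tRNA",
            ifelse(grepl("^unassigned_gene", with.underscore), "unassigned",
                   "other"))

# Count how many names fall into each category
category.counts <- table(category)
print(category.counts)

# Names that are neither tRNA nor unassigned deserve a closer look
other.names <- with.underscore[category == "other"]
print(other.names)
```

On the real data, replacing `feature.names` with `rownames(xenopus.data)` should give the same kind of summary.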

There are some unassigned genes, and then there is one particular gene, ccdc50.L_1:

grep("ccdc50", rownames(xenopus.data), value = TRUE)
## [1] "ccdc50.L"   "ccdc50.S"   "ccdc50.L_1"
grep("ccdc50", rownames(xenopus), value = TRUE)
## [1] "ccdc50.L"   "ccdc50.S"   "ccdc50.L-1"
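
Note that the two outputs differ only in the last name: `ccdc50.L_1` in the count matrix becomes `ccdc50.L-1` in the Seurat object. This is consistent with Seurat replacing underscores in feature names with dashes when the object is created. The sketch below mimics that renaming with base R `gsub()` on toy names. It is an illustration of the effect, not Seurat's internal code.

```r
# Names as they appear in the raw count matrix (copied from the output above)
raw.names <- c("ccdc50.L", "ccdc50.S", "ccdc50.L_1")

# Mimic the renaming: every underscore becomes a dash
object.names <- gsub("_", "-", raw.names, fixed = TRUE)
print(object.names)

# Raw names that can no longer be found after the renaming
changed <- raw.names[!(raw.names %in% object.names)]
print(changed)
```

The practical consequence is that any gene list built from the GTF or the raw matrix may need the same substitution before it is matched against `rownames(xenopus)`.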

Check whether ccdc50.L is an important gene, or whether ccdc50.L_1 is a real separate gene in Xenbase as you learned on Friday.

6 Extracting Gene-level meta information

6.1 Mitochondrial genes

We have checked that the up-to-date assembly contains the mitochondrial genome, and that its mitochondrial genes are annotated (see previous section). The following command trims the annotation down further to the non-tRNA transcripts on chrM, so that we can check the mRNAs that are polyadenylated.

gzcat XENLA_ncbi101.XB2023_04.gtf.gz | gawk '( $1 == "chrM" && $3 == "transcript"  )' | grep -v tRNA
## chrM RefSeq  transcript  2205    3023    .   +   .   gene_id "unassigned_gene_2"; transcript_id "unassigned_transcript_2833"; gbkey "rRNA"; product "12S ribosomal RNA"; transcript_biotype "rRNA";
## chrM RefSeq  transcript  3093    4723    .   +   .   gene_id "unassigned_gene_4"; transcript_id "unassigned_transcript_2835"; gbkey "rRNA"; product "16S ribosomal RNA"; transcript_biotype "rRNA";
## chrM RefSeq  transcript  4799    5770    .   +   .   gene_id "ND1"; transcript_id "unassigned_transcript_2837"; gbkey "mRNA"; gene "ND1"; transcript_biotype "mRNA";
## chrM RefSeq  transcript  5979    7016    .   +   .   gene_id "ND2"; transcript_id "unassigned_transcript_2841"; gbkey "mRNA"; gene "ND2"; transcript_biotype "mRNA";
## chrM RefSeq  transcript  7397    8951    .   +   .   gene_id "COX1"; transcript_id "unassigned_transcript_2847"; gbkey "mRNA"; gene "COX1"; transcript_biotype "mRNA";
## chrM RefSeq  transcript  9109    9796    .   +   .   gene_id "COX2"; transcript_id "unassigned_transcript_2850"; gbkey "mRNA"; gene "COX2"; transcript_biotype "mRNA";
## chrM RefSeq  transcript  9873    10040   .   +   .   gene_id "ATP8"; transcript_id "unassigned_transcript_2852"; gbkey "mRNA"; gene "ATP8"; transcript_biotype "mRNA";
## chrM RefSeq  transcript  10031   10711   .   +   .   gene_id "ATP6"; transcript_id "unassigned_transcript_2853"; gbkey "mRNA"; gene "ATP6"; transcript_biotype "mRNA";
## chrM RefSeq  transcript  10711   11491   .   +   .   gene_id "COX3"; transcript_id "unassigned_transcript_2854"; gbkey "mRNA"; gene "COX3"; transcript_biotype "mRNA";
## chrM RefSeq  transcript  11562   11904   .   +   .   gene_id "ND3"; transcript_id "unassigned_transcript_2856"; gbkey "mRNA"; gene "ND3"; transcript_biotype "mRNA";
## chrM RefSeq  transcript  11974   12270   .   +   .   gene_id "ND4L"; transcript_id "unassigned_transcript_2858"; gbkey "mRNA"; gene "ND4L"; transcript_biotype "mRNA";
## chrM RefSeq  transcript  12264   13647   .   +   .   gene_id "ND4"; transcript_id "unassigned_transcript_2859"; gbkey "mRNA"; gene "ND4"; transcript_biotype "mRNA";
## chrM RefSeq  transcript  13855   15669   .   +   .   gene_id "ND5"; transcript_id "unassigned_transcript_2863"; gbkey "mRNA"; gene "ND5"; transcript_biotype "mRNA";
## chrM RefSeq  transcript  15665   16177   .   -   .   gene_id "ND6"; transcript_id "unassigned_transcript_2864"; gbkey "mRNA"; gene "ND6"; transcript_biotype "mRNA";
## chrM RefSeq  transcript  16249   17388   .   +   .   gene_id "CYTB"; transcript_id "unassigned_transcript_2866"; gbkey "mRNA"; gene "CYTB"; transcript_biotype "mRNA";

There are several ways to extract the set of mitochondrial gene names. One simple way is to type them manually:

mito.genes <- c("ND1", "ND2", "COX1", "COX2", "ATP8", "ATP6", "COX3", "ND3", "ND4L", "ND4", "ND5", "ND6", "CYTB")

mito.genes
##  [1] "ND1"  "ND2"  "COX1" "COX2" "ATP8" "ATP6" "COX3" "ND3"  "ND4L" "ND4" 
## [11] "ND5"  "ND6"  "CYTB"
rownames(xenopus)[rownames(xenopus) %in% mito.genes]
##  [1] "ND1"  "ND2"  "COX1" "COX2" "ATP8" "ATP6" "COX3" "ND3"  "ND4L" "ND4" 
## [11] "ND5"  "ND6"  "CYTB"
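
When a gene list is typed by hand, it may be worth checking explicitly that none of the names is missing from the object, since `%in%` silently drops names that do not match. A minimal sketch follows; `toy.features` is a hypothetical stand-in for `rownames(xenopus)`, with `ND6` left out on purpose to show what a mismatch looks like.

```r
mito.genes <- c("ND1", "ND2", "COX1", "COX2", "ATP8", "ATP6", "COX3",
                "ND3", "ND4L", "ND4", "ND5", "ND6", "CYTB")

# Hypothetical feature names standing in for rownames(xenopus);
# "ND6" is deliberately omitted
toy.features <- c(setdiff(mito.genes, "ND6"), "sox2.L", "ccdc50.L")

# Genes in the hand-typed list that are absent from the feature names
missing.genes <- setdiff(mito.genes, toy.features)
print(missing.genes)

# TRUE only if every hand-typed gene was found
all.present <- length(missing.genes) == 0
print(all.present)
```

With the real object, `setdiff(mito.genes, rownames(xenopus))` should return an empty vector, given that all 13 names were found above.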

The other way is to parse them by code (which I will leave as an advanced quiz, but we will come back to this when extracting ribosomal genes).

In any case, whenever you compile information by hand rather than computing it, save it as a text table for your records.

save_table(
  mito.genes, 
  glue::glue("{project.prefix}mito_genes"),
  format="csv"
)
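
If the `save_table()` helper used above is not available in your session, the same record can be kept with base R. The following is only a sketch under that assumption: the file name and the `gene` column name are hypothetical, and the file is written to a temporary directory so that nothing in your project is overwritten.

```r
mito.genes <- c("ND1", "ND2", "COX1", "COX2", "ATP8", "ATP6", "COX3",
                "ND3", "ND4L", "ND4", "ND5", "ND6", "CYTB")

# Hypothetical output path inside the session's temporary directory
out.file <- file.path(tempdir(), "example_mito_genes.csv")

# Write the gene list as a one-column CSV table
write.csv(data.frame(gene = mito.genes), out.file, row.names = FALSE)

# Read it back and confirm that nothing changed on the way
reloaded <- read.csv(out.file, stringsAsFactors = FALSE)$gene
print(identical(reloaded, mito.genes))
```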

We can now compute, for each cell, the percentage of counts that come from mitochondrial genes.

xenopus[["percent.mt"]] <- Seurat::PercentageFeatureSet( xenopus, features = mito.genes )

xenopus@meta.data
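
To make clear what this percentage means, here is the same calculation done by hand on a tiny toy count matrix: the counts of the mitochondrial genes in a cell, divided by all counts in that cell, times 100. The matrix, the gene names, and the cell names are made up for illustration, and this is a sketch of the idea rather than Seurat's actual implementation.

```r
# Toy count matrix: genes in rows, cells in columns (made-up numbers)
counts <- matrix(
  c(10,  0,  5,
    30, 20,  5,
    60, 80, 90),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("ND1", "COX1", "sox2.L"), c("cell1", "cell2", "cell3"))
)

# Rows that belong to the (toy) mitochondrial gene set
mito.rows <- rownames(counts) %in% c("ND1", "COX1")

# Percentage of each cell's counts that come from mitochondrial genes
percent.mt <- colSums(counts[mito.rows, , drop = FALSE]) / colSums(counts) * 100
print(percent.mt)
```

Each column sums to 100 counts here, so the result can be read directly off the matrix: 40, 20 and 10 percent for the three cells.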

6.2 Ribosomal genes

For ribosomal genes, I have compiled the list from the UniProt database. Download the files.

read_tsv("uniprot-download_true_fields_accession_2Creviewed_2Cid_2Cprotein_nam-2023.02.20-13.44.52.34.tsv") %>% head()
## Rows: 530 Columns: 7
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: "\t"
## chr (6): Entry, Reviewed, Entry Name, Protein names, Gene Names, Organism
## dbl (1): Length
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
read_tsv("uniprot-download_true_fields_accession_2Creviewed_2Cid_2Cprotein_nam-2023.02.20-13.44.52.34.tsv") %>% 
dplyr::count( `Reviewed` )
## Rows: 530 Columns: 7
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: "\t"
## chr (6): Entry, Reviewed, Entry Name, Protein names, Gene Names, Organism
## dbl (1): Length
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# this takes out a lot o
provisional  <-
bind_rows(
    read_tsv("uniprot-download_true_fields_accession_2Creviewed_2Cid_2Cprotein_nam-2023.02.20-13.44.52.34.tsv", show_col_types = F),
) %>%
dplyr::select(`Gene Names`, `Organism` ) %>%
dplyr::filter( !is.na(`Gene Names`) ) %>%
separate_rows( `Gene Names`, convert = FALSE, sep = " " ) %>%
pull(`Gene Names` )
#dplyr::filter( grepl("14e22", `Gene Names` ) )

provisional
##    [1] "rpsa.S"            "37lrp"             "lambr"            
##    [4] "lamr1"             "LR"                "lrp"              
##    [7] "p40"               "RPSA"              "rpsa"             
##   [10] "rpsa.L"            "37LRP"             "67LR"             
##   [13] "lambr"             "LamR"              "lamr1"            
##   [16] "LBP/p40"           "LR"                "lrp"              
##   [19] "LRP/LR"            "p40"               "RPSA"             
##   [22] "rpsa"              "rpsa.S"            "37lrp"            
##   [25] "lambr"             "lamr1"             "LR"               
##   [28] "lrp"               "p40"               "RPSA"             
##   [31] "rpsa"              "LOC100037080"      "RPSA"             
##   [34] "rps3-a"            "rps6ka"            "rps3-b"           
##   [37] "rpsa"              "ogfod1"            "impact"           
##   [40] "top2b.S"           "top2b.L"           "rps6ka6.S"        
##   [43] "pp90rsk4"          "rps6ka"            "rps6ka1"          
##   [46] "rps6ka6"           "rsk4"              "lonp1.L"          
##   [49] "LONP1"             "rps6ka1.L"         "hu-1"             
##   [52] "mapkapk1a"         "rps6ka1"           "rsk"              
##   [55] "rsk1"              "rps6ka4.L"         "rps6ka4"          
##   [58] "top2b.S"           "top2b.L"           "top2b.S"          
##   [61] "top2b.S"           "rps6ka1.L"         "hu-1"             
##   [64] "mapkapk1a"         "rps6ka1"           "rsk"              
##   [67] "rsk1"              "rps6ka1.L"         "hu-1"             
##   [70] "mapkapk1a"         "rps6ka1"           "rsk"              
##   [73] "rsk1"              "rps6ka1.L"         "hu-1"             
##   [76] "mapkapk1a"         "rps6ka1"           "rsk"              
##   [79] "rsk1"              "rps6ka4.L"         "rps6ka4"          
##   [82] "gfm1"              "efg1"              "rpl4-b"           
##   [85] "rpl1b"             "rpl18-b"           "rpl14b"           
##   [88] "rpl5-a"            "rpl5-b"            "lonp2"            
##   [91] "ptcd3"             "snu13"             "nhp2l1"           
##   [94] "hsp90ab1"          "hsp90beta"         "XELAEV_18028538mg"
##   [97] "riox2.L"           "mina"              "mina-prov"        
##  [100] "mina.L"            "NO52"              "riox2"            
##  [103] "nsa2.L"            "cdk105"            "hcl-g1"           
##  [106] "hclg1"             "hussy-29"          "hussy29"          
##  [109] "nsa2"              "tinp1"             "yr-29"            
##  [112] "mrpl24"            "rps6ka1.L"         "hu-1"             
##  [115] "mapkapk1a"         "MGC81220"          "rps6ka1"          
##  [118] "rsk"               "rsk1"              "top2a.L"          
##  [121] "LOC398512"         "top2"              "top2a"            
##  [124] "tp2a"              "snu13.L"           "fa-1"             
##  [127] "fa1"               "hoip"              "nhp2l1"           
##  [130] "nhp2l1-b"          "nhpx"              "otk27"            
##  [133] "snrnp15-5"         "snu13"             "spag12"           
##  [136] "ssfa1"             "rpls3-b"           "rps6ka3.L"        
##  [139] "p90"               "rps6ka3"           "rsk"              
##  [142] "RSK2"              "rsk2"              "S6KII"            
##  [145] "rps6kb1.S"         "p70-alpha"         "p70-s6k"          
##  [148] "p70s6k"            "ps6k"              "rps6kb1"          
##  [151] "rps6kb1-A"         "s6k"               "s6K1"             
##  [154] "stk14a"            "rps27.L"           "pms1.S"           
##  [157] "pms1"              "LOC108703225"      "eftud2.L"         
##  [160] "LOC108700925"      "rpl27.L"           "LOC108700150"     
##  [163] "rack1.L"           "gnb2-rs1"          "gnb2l1"           
##  [166] "h12.3"             "hlc-7"             "pig21"            
##  [169] "rack1.S"           "mrps18b.L"         "mlh3.L"           
##  [172] "mlh3"              "rps6ka5.L"         "mrpl24.L"         
##  [175] "L24"               "mrpl24"            "rpl24"            
##  [178] "rpl24-A"           "mrto4.S"           "rps9.L"           
##  [181] "rpl15.S"           "LOC121394610"      "rps20.L"          
##  [184] "rps20"             "pnpt1.S"           "LOC108717861"     
##  [187] "rps7.S"            "dba8"              "rps7"             
##  [190] "rps7.L"            "rpS8A"             "rpS8B"            
##  [193] "rps27a.L"          "eprs1.L"           "eprs"             
##  [196] "mrpl49.L"          "mvd.L"             "qars1.L"          
##  [199] "nhp2.S"            "nhp2.L"            "LOC108710993"     
##  [202] "rps27l.L"          "mrpl30.S"          "rps26.S"          
##  [205] "riox2.L"           "mina"              "mina.L"           
##  [208] "NO52"              "rpl8.L"            "mrps15.L"         
##  [211] "rps6kb1.L"         "rps15.S"           "rig"              
##  [214] "rps15"             "mrpl35.L"          "rpl34.L"          
##  [217] "L34"               "rpl34"             "xl34"             
##  [220] "mvk.L"             "rplp0.L"           "rpl37.L"          
##  [223] "gfm2.L"            "EFG2"              "GFM2"             
##  [226] "rpl17.L"           "b"                 "l17"              
##  [229] "pd-1"              "rpl17"             "rpl17-a"          
##  [232] "rpl17-b"           "rpl23"             "rps19.S"          
##  [235] "rps19"             "rpl11.S"           "rpl11"            
##  [238] "rpl11.L"           "mlh3.L"            "mlh3"             
##  [241] "rps25.L"           "rpl35.L"           "l35"              
##  [244] "rpl35"             "LOC108698757"      "mlh3.L"           
##  [247] "mlh3"              "mlh3.L"            "mlh3"             
##  [250] "rpl12.L"           "LOC108698451"      "LOC108698451"     
##  [253] "mrps18b.L"         "LOC108700150"      "MGC114789"        
##  [256] "LOC108703484"      "pms2.L"            "rpl6.S"           
##  [259] "rpl6"              "rpl6.S"            "rpl6"             
##  [262] "rps15a.S"          "rps15a"            "rps22"            
##  [265] "LOC108706570"      "rpl6.S"            "rpl6"             
##  [268] "rps27l.L"          "rps26.S"           "mrps31.S"         
##  [271] "rps27l.S"          "eprs1.L"           "eprs"             
##  [274] "rps6kc1.L"         "rpk118"            "rps6kc1"          
##  [277] "s6pkh1"            "mrps18a.L"         "mrp-s18-3"        
##  [280] "mrps18-3"          "mrps18a"           "s18bmt"           
##  [283] "LOC108717877"      "hsp90ab1.S"        "hsp90ab1"         
##  [286] "hsp90b"            "hsp90beta"         "mlh1.S"           
##  [289] "mlh1"              "mutL"              "eprs1.L"          
##  [292] "eprs"              "eprs1.L"           "eprs"             
##  [295] "eprs1.L"           "eprs"              "LOC121393781"     
##  [298] "LOC121393773"      "eprs1.S"           "eprs"             
##  [301] "eprs.S"            "rps12.S"           "rps12"            
##  [304] "rps12-a"           "rps12-b"           "rps12b"           
##  [307] "eprs1.S"           "eprs"              "eprs.S"           
##  [310] "rps24.L"           "rps24"             "rps24.L"          
##  [313] "LOC121393051"      "rps24"             "LOC121393051"     
##  [316] "rps24"             "rps24.L"           "LOC121393051"     
##  [319] "rps24"             "rps24.L"           "LOC121393051"     
##  [322] "rps15.S"           "rig"               "rps15"            
##  [325] "rps27.L"           "mrpl9.S"           "comp72"           
##  [328] "l9mt"              "mrpl9"             "LOC108700925"     
##  [331] "nsa2.S"            "mrps31.L"          "rps6ka3.S"        
##  [334] "rps6ka3.S"         "rps27l.L"          "mlh1.S"           
##  [337] "LOC100036779"      "mlh1"              "mutL"             
##  [340] "mrps36.S"          "dc47"              "LOC100037103"     
##  [343] "mrp-s36"           "mrps36"            "mrps36.L"         
##  [346] "LOC100037089"      "rpl7a.L"           "rpl7a"            
##  [349] "XB5843130.S"       "LOC100049136"      "XB5843130"        
##  [352] "LOC100101334"      "rps3a-b"           "LOC100126642"     
##  [355] "mrpl4.L"           "cgi-28"            "l4mt"             
##  [358] "mrpl4"             "LOC100127328"      "LOC100037086"     
##  [361] "ffcskk.L"          "fcsk"              "fuk"              
##  [364] "fuk.L"             "LOC100137687"      "eprs1.S"          
##  [367] "eprs"              "eprs.S"            "eprs1"            
##  [370] "LOC100158438"      "ipo7.L"            "imp7"             
##  [373] "ipo7"              "MGC52556"          "ranbp7"           
##  [376] "rps7"              "rps8"              "rps24"            
##  [379] "rpl35a"            "rpl4-a"            "rpl-4"            
##  [382] "rpl1a"             "rpl18-a"           "rpl14a"           
##  [385] "rps15"             "rig"               "rps20"            
##  [388] "rps6"              "rps11"             "rpl8"             
##  [391] "rpl28"             "rpl27a"            "rpl22"            
##  [394] "rps12"             "rps27"             "rps13"            
##  [397] "rps4"              "rps4x"             "rpl21"            
##  [400] "rpl26"             "rpl22"             "mrpl45"           
##  [403] "rps10"             "mrps16.L"          "LOC100158393"     
##  [406] "mrps16"            "mrps9.L"           "mrps9"            
##  [409] "mrpl16.S"          "mrpl16"            "mrpl16.L"         
##  [412] "rplp2.L"           "MGC154377"         "rplp2"            
##  [415] "LOC734179"         "mrpl23.S"          "MGC131313"        
##  [418] "mrpl23"            "mrps24-b"          "mrps18b.S"        
##  [421] "MGC130639"         "mrp-s18-2"         "mrps18-2"         
##  [424] "mrps18b"           "ptd017"            "s18amt"           
##  [427] "SCL75"             "rpl39.S"           "MGC116452"        
##  [430] "MGC116477"         "rpl39"             "rpl39-a"          
##  [433] "rpl39-b"           "rpl39.L"           "rpl39a"           
##  [436] "rpl39b"            "rpl35.L"           "l35"              
##  [439] "MGC116425"         "rpl35"             "fau.L"            
##  [442] "fau"               "MGC116435"         "rps29.S"          
##  [445] "MGC114875"         "rps29"             "rps15a.S"         
##  [448] "MGC114789"         "MGC130892"         "rps15a"           
##  [451] "rps22"             "mrpl52"            "mrps18a.L"        
##  [454] "MGC115435"         "mrp-s18-3"         "mrps18-3"         
##  [457] "mrps18a"           "s18bmt"            "MGC114621"        
##  [460] "MGC98504"          "LOC733189"         "MGC115171"        
##  [463] "mrpl40.S"          "LOC496101"         "mrpl40"           
##  [466] "mrps11.L"          "LOC495996"         "mrps11"           
##  [469] "mrpl9.S"           "comp72"            "l9mt"             
##  [472] "LOC495474"         "mrpl9"             "mrpl20"           
##  [475] "mrpl12.L"          "LOC495364"         "mrpl12"           
##  [478] "mrps33.L"          "LOC495310"         "mrps33"           
##  [481] "mrpl3.S"           "LOC494992"         "mrpl3"            
##  [484] "rpl3l.L"           "LOC494722"         "rpl3l"            
##  [487] "mrpl2.L"           "cgi-22"            "MGC84466"         
##  [490] "mrp-l14"           "mrpl2"             "rpml14"           
##  [493] "rps28p9.L"         "MGC85550"          "rps28p9"          
##  [496] "rps28p9.S"         "rpl36"             "rpl36a.L"         
##  [499] "LOC108700056"      "MGC85428"          "rpl36a"           
##  [502] "rpl36a.S"          "rpl38.L"           "MGC85404"         
##  [505] "rpl38"             "rpl38.S"           "rpl28.S"          
##  [508] "l28"               "LOC100101273"      "rpl28"            
##  [511] "rpl28-a"           "rpl28-b"           "rpl28.L"          
##  [514] "rpl29.S"           "MGC85384"          "rpl29"            
##  [517] "rpl29-a"           "rpl29-b"           "rpl29b"           
##  [520] "rpl23a.S"          "l23a"              "mda20"            
##  [523] "MGC85348"          "rpl23a"            "rpl23a.L"         
##  [526] "rpl11.S"           "MGC85310"          "rpl11"            
##  [529] "rpl11.L"           "mrpl51"            "mrpl28"           
##  [532] "rps21"             "rps26.L"           "MGC86356"         
##  [535] "rps26"             "rps23.S"           "MGC86316"         
##  [538] "rps23"             "rps23.L"           "mrpl15"           
##  [541] "rpl17.S"           "l17"               "MGC78885"         
##  [544] "pd-1"              "rpl17"             "rpl17-a"          
##  [547] "rpl17-b"           "rpl17a"            "rpl23"            
##  [550] "yy1.L"             "FIII"              "ino80s"           
##  [553] "nf-e1"             "ucrbp"             "xyy1"             
##  [556] "yin-yang-1"        "yy1"               "yy1-a"            
##  [559] "yy1-b"             "mrps7"             "rpl6.S"           
##  [562] "MGC84358"          "rpl6"              "rpl19.S"          
##  [565] "rpl19"             "rpl19-prov"        "rps8.L"           
##  [568] "MGC83421"          "rps8"              "mrpl41-a"         
##  [571] "LOC398653"         "rps25.L"           "MGC82151"         
##  [574] "rps25"             "rps25.S"           "rps20.S"          
##  [577] "MGC82136"          "rpl13.S"           "rpl13"            
##  [580] "rpl13-prov"        "mrpl17.S"          "MGC83084"         
##  [583] "mrpl17"            "rps27a.S"          "MGC81889"         
##  [586] "rps27a"            "rps27a.L"          "rpl37.S"          
##  [589] "MGC82973"          "rpl37"             "rpl30.L"          
##  [592] "l30"               "MGC82844"          "rpl30"            
##  [595] "rpl30-a"           "rpl30-b"           "rps17.L"          
##  [598] "MGC82841"          "rps17"             "rps17.S"          
##  [601] "rpl23.S"           "LOC108700787"      "MGC82808"         
##  [604] "rpl23"             "galk1.L"           "galk1"            
##  [607] "MGC82807"          "rps6kb1-A"         "rps9.S"           
##  [610] "MGC80804"          "rps9"              "rps9.L"           
##  [613] "mlh3.L"            "MGC80774"          "mlh3"             
##  [616] "MGC80700"          "rpl7a.S"           "MGC80199"         
##  [619] "rpl7a"             "surf3"             "trup"             
##  [622] "rplp2.S"           "MGC80163"          "uba52.L"          
##  [625] "LOC108706905"      "MGC80109"          "uba52"            
##  [628] "exosc9.L"          "exosc9"            "SCL75"            
##  [631] "scl75"             "mrpl41-b"          "rpl14.S"          
##  [634] "MGC83076"          "rpl14"             "mrps24-a"         
##  [637] "rps18.L"           "LOC108700218"      "MGC82306"         
##  [640] "rps18"             "mrps26.L"          "MGC82245"         
##  [643] "mrp-s13"           "mrp-s26"           "mrps13"           
##  [646] "mrps26"            "rpms13"            "nhp2"             
##  [649] "nola2"             "rps6kc1.L"         "MGC81290"         
##  [652] "rpk118"            "rps6kc1"           "s6pkh1"           
##  [655] "rpl31"             "rplp1.L"           "MGC68562"         
##  [658] "rplp1"             "hsp90b1.S"         "ecgp"             
##  [661] "gp96"              "grp94"             "hsp90b1"          
##  [664] "MGC68448"          "tra1"              "qars1.S"          
##  [667] "MGC69128"          "qars"              "qars.S"           
##  [670] "qars1"             "rps12.S"           "MGC68529"         
##  [673] "rps12"             "rps12-a"           "rps12-b"          
##  [676] "rps12b"            "rpl27.S"           "rpl27"            
##  [679] "rps19.S"           "rps19"             "rps14.S"          
##  [682] "rps14"             "rps14-prov"        "rps14.L"          
##  [685] "rpl4.S"            "rpl4"              "rpl4-b"           
##  [688] "rps8"              "mrpl44.S"          "mrpl44"           
##  [691] "rps11.L"           "LOC108702869"      "rps11"            
##  [694] "rpl28.L"           "l28"               "rpl28"            
##  [697] "rpl28-a"           "rpl28-b"           "rpl28.S"          
##  [700] "rpl18.S"           "MGC64315"          "rpl14a"           
##  [703] "rpl18"             "rpl18-a"           "rpl18-b"          
##  [706] "rpl18.L"           "rpl29.L"           "rpl29"            
##  [709] "rpl29-a"           "rpl29-b"           "rpl29a"           
##  [712] "rpl18.L"           "L14B"              "rpl14b"           
##  [715] "rpl18"             "rpl18-a"           "rpl18-b"          
##  [718] "rpl18.S"           "rpl35a.S"          "LOC121393140"     
##  [721] "rpl35a"            "rpl35a.L"          "rpl27a.L"         
##  [724] "LOC121400351"      "rpl27a"            "rpl21.L"          
##  [727] "rpl21"             "rpl37a"            "LOC398653"        
##  [730] "rpl18a.S"          "MGC64263"          "rpl18a"           
##  [733] "rpl30.S"           "l30"               "rpl30"            
##  [736] "rpl30-a"           "rpl30-b"           "rps13.L"          
##  [739] "rps13"             "mrto4.L"           "mrt4"             
##  [742] "mrto4"             "rps2.L"            "rps2"             
##  [745] "rps2e"             "hsp90b1.L"         "ecgp"             
##  [748] "gp96"              "grp94"             "hsp90b1"          
##  [751] "tra1"              "rpl9.L"            "rpl9"             
##  [754] "rpl15.L"           "rpl15"             "rpl10.S"          
##  [757] "rpl10"             "rpl10.L"           "eef2.1.L"         
##  [760] "eef-2"             "eef2"              "eef2.1"           
##  [763] "ef2"               "LOC398512"         "pms1.S"           
##  [766] "pms1"              "eftud2.S"          "eftud2"           
##  [769] "snrp116"           "snu114"            "rps12.L"          
##  [772] "rps12"             "rps12-a"           "rps12-b"          
##  [775] "rps12a"            "rpl13a.L"          "rpl13a"           
##  [778] "rpl17.L"           "RPL17"             "rpl17"            
##  [781] "ckm.L"             "ckm"               "ckm.S"            
##  [784] "ckmm"              "m-ck"              "rpl3.L"           
##  [787] "rpl3"              "rpl19"             "rpl10a"           
##  [790] "rps6.L"            "rps6"              "rps6-a"           
##  [793] "rps6-b"            "rps6b"             "rps9"             
##  [796] "rps3a-a"           "rplp0.S"           "arbp"             
##  [799] "l10e"              "lp0"               "prlp0"            
##  [802] "rplp0"             "rpp0"              "rpl5.L"           
##  [805] "rpl5"              "rpl5-a"            "rpl5-b"           
##  [808] "rack1.L"           "gnb2-rs1"          "gnb2l1"           
##  [811] "h12.3"             "hlc-7"             "LOC446289"        
##  [814] "pig21"             "rack1"             "rack1.S"          
##  [817] "exosc8.S"          "exosc8"            "exosc8.L"         
##  [820] "MGC52847"          "rpl12.S"           "rpl12"            
##  [823] "rpl5.S"            "rpl5"              "rpl5-a"           
##  [826] "rpl5-b"            "rpl34.S"           "L34"              
##  [829] "rpl34"             "rpl34.L"           "XL34"             
##  [832] "xl34"              "yy1.L"             "FIII"             
##  [835] "ino80s"            "nf-e1"             "ucrbp"            
##  [838] "xyy1"              "yin-yang-1"        "yy1"              
##  [841] "yy1-a"             "yy1-b"             "mrpl10.S"         
##  [844] "mrps7.S"           "rpl13a.S"          "trap1.S"          
##  [847] "galk1.L"           "galk1"             "mrpl28.L"         
##  [850] "mrpl28"            "rsl1d1.L"          "rsl1d1"           
##  [853] "trap1.L"           "mrps10.S"          "mrps12.S"         
##  [856] "mrps21.S"          "rpl10.L"           "rps5.L"           
##  [859] "exosc5.L"          "rps6kl1.L"         "hsp90aa1.1.L"     
##  [862] "LOC108698781"      "mrps21.L"          "LOC108697549"     
##  [865] "LOC108697688"      "mrpl51.L"          "XB5896631.L"      
##  [868] "rps19.L"           "mrpl32.S"          "LOC108695345"     
##  [871] "mrpl55.L"          "mrpl3.L"           "rps6kc1.S"        
##  [874] "mrpl57.S"          "mrpl33.L"          "mrpl14.L"         
##  [877] "mrps5.L"           "fau.S"             "mrpl11.S"         
##  [880] "mrps14.S"          "imp3.S"            "LOC108715766"     
##  [883] "LOC108715857"      "mrps25.S"          "exosc6.L"         
##  [886] "rps19bp1.L"        "LOC108714734"      "mrps33.S"         
##  [889] "mrpl46.S"          "rps16.S"           "mrpl42.S"         
##  [892] "rpl26.S"           "mrps11.L"          "mrps11"           
##  [895] "mrpl46.L"          "rpl37a.L"          "LOC108711439"     
##  [898] "LOC108712865"      "rpl23a.L"          "mrps6.S"          
##  [901] "rpl24.S"           "mrps31.S"          "mrpl48.S"         
##  [904] "mrpl48"            "mrpl39.L"          "LOC108707957"     
##  [907] "rpl11.L"           "rpl24.L"           "rpl24"            
##  [910] "mrps31.L"          "mrpl54.S"          "mrps18c.S"        
##  [913] "mrpl52.S"          "mrpl52"            "mrpl1.L"          
##  [916] "LOC108712230"      "LOC108713416"      "mrpl52.L"         
##  [919] "rpl6.L"            "eef2.2.L"          "MGC68699"         
##  [922] "mrps18c.L"         "mrpl38.L"          "mrpl38"           
##  [925] "mrpl51.L"          "XB5896631.L"       "LOC108698781"     
##  [928] "LOC108698235"      "LOC108700022"      "mrpl1.S"          
##  [931] "LOC108700022"      "rsl1d1.S"          "mrpl27.L"         
##  [934] "rsl1d1.S"          "rpl3l.S"           "mrpl48.S"         
##  [937] "mrpl48"            "rps10.L"           "rps10"            
##  [940] "LOC108707348"      "eef2.1.S"          "LOC108708365"     
##  [943] "mrpl42.L"          "mrpl48.S"          "mrpl48"           
##  [946] "galk2.L"           "galk2"             "rps14.L"          
##  [949] "mrpl42.L"          "rpl18a.L"          "galk2.L"          
##  [952] "galk2"             "mrpl1.L"           "LOC108710319"     
##  [955] "mrps11.S"          "mrpl52.L"          "mrps25.S"         
##  [958] "rpl7l1.L"          "rpl7l1"            "mrps27.L"         
##  [961] "mrpl47.L"          "mrpl47.L"          "mrpl3.S"          
##  [964] "mrpl3"             "mrpl57.S"          "mrpl19.S"         
##  [967] "mrpl3.L"           "LOC108715263"      "LOC108716102"     
##  [970] "LOC108716102"      "mrps22.L"          "mrps22.L"         
##  [973] "LOC108717404"      "LOC108705784"      "LOC108717492"     
##  [976] "LOC121394503"      "mrpl51.L"          "LOC121395362"     
##  [979] "LOC121396125"      "LOC108705858"      "LOC121396648"     
##  [982] "rpl36a.L"          "rpl36a"            "rpl36a.S"         
##  [985] "LOC121396642"      "exosc5.L"          "dap3.L"           
##  [988] "dap3.L"            "LOC108700056"      "rsl1d1.L"         
##  [991] "rsl1d1"            "rsl1d1.L"          "rsl1d1"           
##  [994] "mrpl45.L"          "mrlp45"            "mrpl45"           
##  [997] "LOC108706079"      "mrps18c.S"         "LOC121400386"     
## [1000] "LOC121400621"      "LOC121400576"      "LOC121400621"     
## [1003] "LOC108709120"      "mrps11.L"          "mrps11"           
## [1006] "mrpl22.L"          "LOC108712266"      "LOC108712865"     
## [1009] "LOC108712266"      "LOC108712865"      "LOC108712883"     
## [1012] "LOC108712266"      "rps13.L"           "rps13"            
## [1015] "LOC121402816"      "LOC121402815"      "LOC121393045"     
## [1018] "mrpl1.L"           "rpl22l1.L"         "LOC100037062"     
## [1021] "rpl22l1"           "LOC100037111"      "mrpl37.L"         
## [1024] "LOC733390"         "mrpl37"            "LOC100037086"     
## [1027] "LOC100037184"      "PDCD9"             "LOC100049095"     
## [1030] "mrps2.L"           "LOC443704"         "mrps2"            
## [1033] "mrps25.L"          "LOC100049746"      "mrps25"           
## [1036] "LOC100126614"      "mrpl32.L"          "LOC100126616"     
## [1039] "mrpl32"            "LOC100127333"      "mrps35.L"         
## [1042] "LOC100127340"      "mrps35"            "PDCD9"            
## [1045] "rsl1d1.L"          "LOC100137646"      "rsl1d1"           
## [1048] "rps10.L"           "LOC445824"         "rps10"            
## [1051] "mrps2.S"           "LOC733351"         "LOC733390"        
## [1054] "mrps30.L"          "MGC131350"         "mrps30"           
## [1057] "LOC733400"         "LOC733390"         "rpl7.L"           
## [1060] "MGC130910"         "rpl7"              "LOC733385"        
## [1063] "mrpl21.L"          "MGC131341"         "mrpl21"           
## [1066] "LOC733351"         "rps16"             "LOC446962"        
## [1069] "rpl34"             "LOC733302"         "LOC734162"        
## [1072] "mrpl11.L"          "LOC496259"         "mrpl11"           
## [1075] "mrpl13.S"          "LOC496258"         "mrpl13"           
## [1078] "exosc8.L"          "LOC496046"         "exosc4.S"         
## [1081] "exosc4"            "LOC495942"         "LOC495666"        
## [1084] "rpl7l1.L"          "LOC495349"         "rpl7l1"           
## [1087] "mrpl48.S"          "LOC495212"         "mrpl48"           
## [1090] "LOC446962"         "mrpl53.L"          "MGC85354"         
## [1093] "mrpl53"            "rpl32.L"           "MGC85374"         
## [1096] "rpl32"             "rpl24.L"           "MGC85232"         
## [1099] "rpl24"             "MGC84749"          "eft-2-prov"       
## [1102] "exosc7.S"          "exosc7"            "exosc7-prov"      
## [1105] "mrpl1.S"           "mrpl1"             "mrpl1-prov"       
## [1108] "rpl26.L"           "LOC443704"         "LOC445824"        
## [1111] "rsl24d1.L"         "MGC81028"          "rlp24"            
## [1114] "rpl24"             "rpl24l"            "rsl24d1"          
## [1117] "rvas3"             "mrpl11.S"          "MGC82344"         
## [1120] "mrpl11"            "imp3.L"            "imp3"             
## [1123] "MGC81216"          "rps16.L"           "MGC80065"         
## [1126] "rps16"             "rps5.S"            "rps5"             
## [1129] "mrps17.L"          "mrps17"            "mrps12.L"         
## [1132] "mrps12"            "rps10.S"           "rps10"            
## [1135] "LOC398682"         "mrpl18.L"          "mrpl18"           
## [1138] "mrps34.S"          "mrps34"            "galk2.L"          
## [1141] "galk2"             "mrpl43.L"          "mrpl43"           
## [1144] "rsl24d1.S"         "rlp24"             "rpl24"            
## [1147] "rpl24l"            "rsl24d1"           "rvas3"            
## [1150] "RPL18A"            "mrps30.S"          "mrps30"           
## [1153] "PDCD9"             "pdcd9"             "S14"              
## [1156] "RACK1"
# Total number of entries in `provisional` (repeated names are counted each time)
provisional %>% length()
## [1] 1156
# Number of distinct entries that exactly match a row (gene) name of `xenopus`
intersect( rownames(xenopus), provisional ) %>% length()
## [1] 309
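The two counts above are not directly comparable: `length()` counts every entry of `provisional`, repeats included, whereas `intersect()` (like `setdiff()`) returns each value only once. A minimal sketch with made-up vectors shows the difference; the names `aliases` and `genes` are hypothetical and are not objects from this session.

```r
# Toy data, invented for illustration only
aliases <- c("rpl6.S", "rpl6", "rpl6.S", "rpl6", "mlh3.L", "mlh3")
genes   <- c("rpl6.S", "mlh3.L", "actb.L")

length(aliases)          # counts every entry, repeats included
length(unique(aliases))  # counts distinct entries only

matched   <- intersect(genes, aliases)  # distinct names present in both
unmatched <- setdiff(aliases, genes)    # distinct aliases absent from genes

# Matched and unmatched together account for the distinct aliases
length(matched) + length(unmatched) == length(unique(aliases))
```

Running `length(unique(provisional))` on the real vector would show how many of the 1156 entries are repeats.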
# intersect( gene.list.frog, provisional )
# Entries with no exact match in rownames(xenopus): potential genes that might be
# missed only because of naming differences
potential <- setdiff( provisional, rownames(xenopus) )
potential
##   [1] "37lrp"             "lambr"             "lamr1"            
##   [4] "LR"                "lrp"               "p40"              
##   [7] "RPSA"              "rpsa"              "37LRP"            
##  [10] "67LR"              "LamR"              "LBP/p40"          
##  [13] "LRP/LR"            "LOC100037080"      "rps3-a"           
##  [16] "rps6ka"            "rps3-b"            "ogfod1"           
##  [19] "impact"            "pp90rsk4"          "rps6ka1"          
##  [22] "rps6ka6"           "rsk4"              "LONP1"            
##  [25] "hu-1"              "mapkapk1a"         "rsk"              
##  [28] "rsk1"              "rps6ka4"           "gfm1"             
##  [31] "efg1"              "rpl4-b"            "rpl1b"            
##  [34] "rpl18-b"           "rpl14b"            "rpl5-a"           
##  [37] "rpl5-b"            "lonp2"             "ptcd3"            
##  [40] "snu13"             "nhp2l1"            "hsp90ab1"         
##  [43] "hsp90beta"         "XELAEV_18028538mg" "mina"             
##  [46] "mina-prov"         "mina.L"            "NO52"             
##  [49] "riox2"             "cdk105"            "hcl-g1"           
##  [52] "hclg1"             "hussy-29"          "hussy29"          
##  [55] "nsa2"              "tinp1"             "yr-29"            
##  [58] "mrpl24"            "MGC81220"          "LOC398512"        
##  [61] "top2"              "top2a"             "tp2a"             
##  [64] "fa-1"              "fa1"               "hoip"             
##  [67] "nhp2l1-b"          "nhpx"              "otk27"            
##  [70] "snrnp15-5"         "spag12"            "ssfa1"            
##  [73] "rpls3-b"           "p90"               "rps6ka3"          
##  [76] "RSK2"              "rsk2"              "S6KII"            
##  [79] "p70-alpha"         "p70-s6k"           "p70s6k"           
##  [82] "ps6k"              "rps6kb1"           "rps6kb1-A"        
##  [85] "s6k"               "s6K1"              "stk14a"           
##  [88] "pms1"              "gnb2-rs1"          "gnb2l1"           
##  [91] "h12.3"             "hlc-7"             "pig21"            
##  [94] "mlh3"              "L24"               "rpl24"            
##  [97] "rpl24-A"           "rps20"             "dba8"             
## [100] "rps7"              "rpS8A"             "rpS8B"            
## [103] "eprs"              "nhp2.L"            "rig"              
## [106] "rps15"             "L34"               "rpl34"            
## [109] "xl34"              "EFG2"              "GFM2"             
## [112] "b"                 "l17"               "pd-1"             
## [115] "rpl17"             "rpl17-a"           "rpl17-b"          
## [118] "rpl23"             "rps19"             "rpl11"            
## [121] "l35"               "rpl35"             "rpl6"             
## [124] "rps15a"            "rps22"             "rpk118"           
## [127] "rps6kc1"           "s6pkh1"            "mrp-s18-3"        
## [130] "mrps18-3"          "mrps18a"           "s18bmt"           
## [133] "LOC108717877"      "hsp90b"            "mlh1"             
## [136] "mutL"              "eprs.S"            "rps12"            
## [139] "rps12-a"           "rps12-b"           "rps12b"           
## [142] "rps24"             "comp72"            "l9mt"             
## [145] "mrpl9"             "LOC100036779"      "dc47"             
## [148] "LOC100037103"      "mrp-s36"           "mrps36"           
## [151] "mrps36.L"          "LOC100037089"      "rpl7a"            
## [154] "LOC100049136"      "XB5843130"         "LOC100101334"     
## [157] "rps3a-b"           "LOC100126642"      "cgi-28"           
## [160] "l4mt"              "mrpl4"             "LOC100127328"     
## [163] "LOC100037086"      "fcsk"              "fuk"              
## [166] "fuk.L"             "LOC100137687"      "eprs1"            
## [169] "LOC100158438"      "imp7"              "ipo7"             
## [172] "MGC52556"          "ranbp7"            "rps8"             
## [175] "rpl35a"            "rpl4-a"            "rpl-4"            
## [178] "rpl1a"             "rpl18-a"           "rpl14a"           
## [181] "rps6"              "rps11"             "rpl8"             
## [184] "rpl28"             "rpl27a"            "rpl22"            
## [187] "rps27"             "rps13"             "rps4"             
## [190] "rps4x"             "rpl21"             "rpl26"            
## [193] "mrpl45"            "rps10"             "LOC100158393"     
## [196] "mrps16"            "mrps9"             "mrpl16"           
## [199] "mrpl16.L"          "MGC154377"         "rplp2"            
## [202] "LOC734179"         "MGC131313"         "mrpl23"           
## [205] "mrps24-b"          "MGC130639"         "mrp-s18-2"        
## [208] "mrps18-2"          "mrps18b"           "ptd017"           
## [211] "s18amt"            "SCL75"             "MGC116452"        
## [214] "MGC116477"         "rpl39"             "rpl39-a"          
## [217] "rpl39-b"           "rpl39a"            "rpl39b"           
## [220] "MGC116425"         "fau"               "MGC116435"        
## [223] "MGC114875"         "rps29"             "MGC130892"        
## [226] "mrpl52"            "MGC115435"         "MGC98504"         
## [229] "LOC733189"         "MGC115171"         "LOC496101"        
## [232] "mrpl40"            "LOC495996"         "mrps11"           
## [235] "LOC495474"         "mrpl20"            "LOC495364"        
## [238] "mrpl12"            "LOC495310"         "mrps33"           
## [241] "LOC494992"         "mrpl3"             "rpl3l.L"          
## [244] "LOC494722"         "rpl3l"             "cgi-22"           
## [247] "MGC84466"          "mrp-l14"           "mrpl2"            
## [250] "rpml14"            "MGC85550"          "rps28p9"          
## [253] "rpl36"             "MGC85428"          "rpl36a"           
## [256] "rpl36a.S"          "MGC85404"          "rpl38"            
## [259] "l28"               "LOC100101273"      "rpl28-a"          
## [262] "rpl28-b"           "MGC85384"          "rpl29"            
## [265] "rpl29-a"           "rpl29-b"           "rpl29b"           
## [268] "l23a"              "mda20"             "MGC85348"         
## [271] "rpl23a"            "MGC85310"          "mrpl51"           
## [274] "mrpl28"            "rps21"             "MGC86356"         
## [277] "rps26"             "MGC86316"          "rps23"            
## [280] "mrpl15"            "MGC78885"          "rpl17a"           
## [283] "FIII"              "ino80s"            "nf-e1"            
## [286] "ucrbp"             "xyy1"              "yin-yang-1"       
## [289] "yy1"               "yy1-a"             "yy1-b"            
## [292] "mrps7"             "MGC84358"          "rpl19.S"          
## [295] "rpl19"             "rpl19-prov"        "MGC83421"         
## [298] "mrpl41-a"          "MGC82151"          "rps25"            
## [301] "MGC82136"          "rpl13"             "rpl13-prov"       
## [304] "MGC83084"          "mrpl17"            "MGC81889"         
## [307] "rps27a"            "MGC82973"          "rpl37"            
## [310] "l30"               "MGC82844"          "rpl30"            
## [313] "rpl30-a"           "rpl30-b"           "MGC82841"         
## [316] "rps17"             "MGC82808"          "galk1"            
## [319] "MGC82807"          "MGC80804"          "MGC80774"         
## [322] "MGC80199"          "surf3"             "trup"             
## [325] "MGC80163"          "MGC80109"          "uba52"            
## [328] "exosc9"            "scl75"             "mrpl41-b"         
## [331] "MGC83076"          "rpl14"             "mrps24-a"         
## [334] "MGC82306"          "rps18"             "MGC82245"         
## [337] "mrp-s13"           "mrp-s26"           "mrps13"           
## [340] "mrps26"            "rpms13"            "nhp2"             
## [343] "nola2"             "MGC81290"          "rpl31"            
## [346] "MGC68562"          "rplp1"             "ecgp"             
## [349] "gp96"              "grp94"             "hsp90b1"          
## [352] "MGC68448"          "tra1"              "MGC69128"         
## [355] "qars"              "qars.S"            "qars1"            
## [358] "MGC68529"          "rpl27"             "rps14"            
## [361] "rps14-prov"        "rpl4"              "mrpl44"           
## [364] "MGC64315"          "rpl18"             "rpl29a"           
## [367] "L14B"              "rpl35a.L"          "rpl37a"           
## [370] "MGC64263"          "rpl18a"            "mrt4"             
## [373] "mrto4"             "rps2"              "rps2e"            
## [376] "rpl9"              "rpl15"             "rpl10"            
## [379] "eef-2"             "eef2"              "eef2.1"           
## [382] "ef2"               "eftud2"            "snrp116"          
## [385] "snu114"            "rps12a"            "rpl13a"           
## [388] "RPL17"             "ckm"               "ckm.S"            
## [391] "ckmm"              "m-ck"              "rpl3"             
## [394] "rpl10a"            "rps6-a"            "rps6-b"           
## [397] "rps6b"             "rps3a-a"           "arbp"             
## [400] "l10e"              "lp0"               "prlp0"            
## [403] "rplp0"             "rpp0"              "rpl5"             
## [406] "LOC446289"         "rack1"             "exosc8"           
## [409] "MGC52847"          "rpl12"             "XL34"             
## [412] "rsl1d1"            "rps6kl1.L"         "hsp90aa1.1.L"     
## [415] "mrps14.S"          "rpl37a.L"          "mrpl48"           
## [418] "LOC108707957"      "eef2.2.L"          "MGC68699"         
## [421] "mrpl38"            "LOC108698235"      "rpl3l.S"          
## [424] "LOC108707348"      "galk2"             "LOC108710319"     
## [427] "rpl7l1"            "LOC108715263"      "LOC121395362"     
## [430] "LOC121396125"      "LOC108705858"      "LOC121396648"     
## [433] "LOC121396642"      "mrlp45"            "LOC108706079"     
## [436] "LOC121400386"      "LOC121400621"      "LOC121400576"     
## [439] "LOC108709120"      "LOC108712266"      "LOC121402816"     
## [442] "LOC121402815"      "LOC100037062"      "rpl22l1"          
## [445] "LOC100037111"      "LOC733390"         "mrpl37"           
## [448] "LOC100037184"      "PDCD9"             "LOC100049095"     
## [451] "LOC443704"         "mrps2"             "LOC100049746"     
## [454] "mrps25"            "LOC100126614"      "LOC100126616"     
## [457] "mrpl32"            "LOC100127333"      "LOC100127340"     
## [460] "mrps35"            "LOC100137646"      "LOC445824"        
## [463] "LOC733351"         "MGC131350"         "mrps30"           
## [466] "LOC733400"         "MGC130910"         "rpl7"             
## [469] "LOC733385"         "MGC131341"         "mrpl21"           
## [472] "rps16"             "LOC446962"         "LOC733302"        
## [475] "LOC734162"         "LOC496259"         "mrpl11"           
## [478] "LOC496258"         "mrpl13"            "LOC496046"        
## [481] "exosc4"            "LOC495942"         "LOC495666"        
## [484] "LOC495349"         "LOC495212"         "MGC85354"         
## [487] "mrpl53"            "MGC85374"          "rpl32"            
## [490] "MGC85232"          "MGC84749"          "eft-2-prov"       
## [493] "exosc7"            "exosc7-prov"       "mrpl1"            
## [496] "mrpl1-prov"        "MGC81028"          "rlp24"            
## [499] "rpl24l"            "rsl24d1"           "rvas3"            
## [502] "MGC82344"          "imp3"              "MGC81216"         
## [505] "MGC80065"          "rps5"              "mrps17"           
## [508] "mrps12"            "LOC398682"         "mrpl18"           
## [511] "mrps34"            "mrpl43"            "RPL18A"           
## [514] "pdcd9"             "S14"               "RACK1"
provisional2 <- intersect( c( paste0( potential, ".L" ), paste0( potential, ".S" )), rownames(xenopus))

provisional2
##   [1] "rpsa.L"      "ogfod1.L"    "impact.L"    "rps6ka1.L"   "rps6ka4.L"  
##   [6] "gfm1.L"      "ptcd3.L"     "snu13.L"     "riox2.L"     "nsa2.L"     
##  [11] "mrpl24.L"    "top2a.L"     "rps6ka3.L"   "rps6kb1.L"   "mlh3.L"     
##  [16] "rpl24.L"     "rps20.L"     "rps7.L"      "rps15.L"     "rpl34.L"    
##  [21] "rpl17.L"     "rps19.L"     "rpl11.L"     "rpl35.L"     "rpl6.L"     
##  [26] "rps6kc1.L"   "mrps18a.L"   "rps12.L"     "rps24.L"     "rpl7a.L"    
##  [31] "mrpl4.L"     "eprs1.L"     "ipo7.L"      "rps8.L"      "rps6.L"     
##  [36] "rps11.L"     "rpl8.L"      "rpl28.L"     "rpl27a.L"    "rpl22.L"    
##  [41] "rps27.L"     "rps13.L"     "rps4x.L"     "rpl21.L"     "rpl26.L"    
##  [46] "mrpl45.L"    "rps10.L"     "mrps16.L"    "mrps9.L"     "rplp2.L"    
##  [51] "mrps18b.L"   "rpl39.L"     "fau.L"       "mrpl52.L"    "mrps11.L"   
##  [56] "mrpl20.L"    "mrpl12.L"    "mrps33.L"    "mrpl3.L"     "mrpl2.L"    
##  [61] "rps28p9.L"   "rpl36.L"     "rpl36a.L"    "rpl38.L"     "rpl29.L"    
##  [66] "rpl23a.L"    "mrpl51.L"    "mrpl28.L"    "rps26.L"     "rps23.L"    
##  [71] "mrpl15.L"    "yy1.L"       "mrps7.L"     "rpl19.L"     "rps25.L"    
##  [76] "rps27a.L"    "rpl37.L"     "rpl30.L"     "rps17.L"     "galk1.L"    
##  [81] "uba52.L"     "exosc9.L"    "rps18.L"     "mrps26.L"    "rpl31.L"    
##  [86] "rplp1.L"     "hsp90b1.L"   "qars1.L"     "rpl27.L"     "rps14.L"    
##  [91] "rpl4.L"      "rpl18.L"     "rpl18a.L"    "mrto4.L"     "rps2.L"     
##  [96] "rpl9.L"      "rpl15.L"     "rpl10.L"     "eef2.1.L"    "eftud2.L"   
## [101] "rpl13a.L"    "ckm.L"       "rpl3.L"      "rplp0.L"     "rpl5.L"     
## [106] "rack1.L"     "exosc8.L"    "rpl12.L"     "rsl1d1.L"    "mrpl38.L"   
## [111] "galk2.L"     "rpl7l1.L"    "rpl22l1.L"   "mrpl37.L"    "mrps2.L"    
## [116] "mrps25.L"    "mrpl32.L"    "mrps35.L"    "mrps30.L"    "rpl7.L"     
## [121] "mrpl21.L"    "rps16.L"     "mrpl11.L"    "mrpl53.L"    "rpl32.L"    
## [126] "mrpl1.L"     "rsl24d1.L"   "imp3.L"      "rps5.L"      "mrps17.L"   
## [131] "mrps12.L"    "mrpl18.L"    "mrpl43.L"    "rpsa.S"      "rps6ka6.S"  
## [136] "snu13.S"     "hsp90ab1.S"  "nsa2.S"      "rps6ka3.S"   "rps6kb1.S"  
## [141] "pms1.S"      "rpl24.S"     "rps20.S"     "rps7.S"      "rps15.S"    
## [146] "rpl34.S"     "rpl17.S"     "rpl23.S"     "rps19.S"     "rpl11.S"    
## [151] "rpl6.S"      "rps15a.S"    "rps6kc1.S"   "mlh1.S"      "rps12.S"    
## [156] "mrpl9.S"     "mrps36.S"    "rpl7a.S"     "XB5843130.S" "eprs1.S"    
## [161] "ipo7.S"      "rps8.S"      "rpl35a.S"    "rps6.S"      "rpl8.S"     
## [166] "rpl28.S"     "rpl27a.S"    "rpl22.S"     "rps27.S"     "rps13.S"    
## [171] "rps4x.S"     "rpl26.S"     "rps10.S"     "mrpl16.S"    "rplp2.S"    
## [176] "mrpl23.S"    "mrps18b.S"   "rpl39.S"     "fau.S"       "rps29.S"    
## [181] "mrpl52.S"    "mrpl40.S"    "mrps11.S"    "mrps33.S"    "mrpl3.S"    
## [186] "rps28p9.S"   "rpl36.S"     "rpl38.S"     "rpl29.S"     "rpl23a.S"   
## [191] "rps21.S"     "rps26.S"     "rps23.S"     "yy1.S"       "mrps7.S"    
## [196] "rps25.S"     "rpl13.S"     "mrpl17.S"    "rps27a.S"    "rpl37.S"    
## [201] "rpl30.S"     "rps17.S"     "rpl14.S"     "nhp2.S"      "rpl31.S"    
## [206] "hsp90b1.S"   "qars1.S"     "rpl27.S"     "rps14.S"     "rpl4.S"     
## [211] "mrpl44.S"    "rpl18.S"     "rpl18a.S"    "mrto4.S"     "rpl15.S"    
## [216] "rpl10.S"     "eef2.1.S"    "eftud2.S"    "rpl13a.S"    "rpl10a.S"   
## [221] "rplp0.S"     "rpl5.S"      "rack1.S"     "exosc8.S"    "rpl12.S"    
## [226] "rsl1d1.S"    "mrpl48.S"    "mrps2.S"     "mrps25.S"    "mrpl32.S"   
## [231] "mrps30.S"    "rps16.S"     "mrpl11.S"    "mrpl13.S"    "exosc4.S"   
## [236] "exosc7.S"    "mrpl1.S"     "rsl24d1.S"   "imp3.S"      "rps5.S"     
## [241] "mrps12.S"    "mrps34.S"
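The step above appends the `.L` and `.S` homeolog suffixes to every candidate name and keeps only those that exist among the features. A minimal sketch on made-up names (`potential.toy` and `features.toy` are illustrative, not taken from the dataset):

```r
# Toy candidate symbols (without suffix) and toy feature names (with suffix)
potential.toy <- c("rpl3", "rps5", "notagene")
features.toy  <- c("rpl3.L", "rps5.L", "rps5.S", "actb.L")

# Build every possible .L / .S name, then keep the ones that really exist
candidates <- c(paste0(potential.toy, ".L"), paste0(potential.toy, ".S"))
found <- intersect(candidates, features.toy)

print(found)
```

`intersect()` keeps the order of its first argument, which is why all `.L` names come before the `.S` names in the result.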
potential2 <- setdiff( provisional, provisional2 )
potential2
##   [1] "37lrp"             "lambr"             "lamr1"            
##   [4] "LR"                "lrp"               "p40"              
##   [7] "RPSA"              "rpsa"              "37LRP"            
##  [10] "67LR"              "LamR"              "LBP/p40"          
##  [13] "LRP/LR"            "LOC100037080"      "rps3-a"           
##  [16] "rps6ka"            "rps3-b"            "ogfod1"           
##  [19] "impact"            "top2b.S"           "top2b.L"          
##  [22] "pp90rsk4"          "rps6ka1"           "rps6ka6"          
##  [25] "rsk4"              "lonp1.L"           "LONP1"            
##  [28] "hu-1"              "mapkapk1a"         "rsk"              
##  [31] "rsk1"              "rps6ka4"           "gfm1"             
##  [34] "efg1"              "rpl4-b"            "rpl1b"            
##  [37] "rpl18-b"           "rpl14b"            "rpl5-a"           
##  [40] "rpl5-b"            "lonp2"             "ptcd3"            
##  [43] "snu13"             "nhp2l1"            "hsp90ab1"         
##  [46] "hsp90beta"         "XELAEV_18028538mg" "mina"             
##  [49] "mina-prov"         "mina.L"            "NO52"             
##  [52] "riox2"             "cdk105"            "hcl-g1"           
##  [55] "hclg1"             "hussy-29"          "hussy29"          
##  [58] "nsa2"              "tinp1"             "yr-29"            
##  [61] "mrpl24"            "MGC81220"          "LOC398512"        
##  [64] "top2"              "top2a"             "tp2a"             
##  [67] "fa-1"              "fa1"               "hoip"             
##  [70] "nhp2l1-b"          "nhpx"              "otk27"            
##  [73] "snrnp15-5"         "spag12"            "ssfa1"            
##  [76] "rpls3-b"           "p90"               "rps6ka3"          
##  [79] "RSK2"              "rsk2"              "S6KII"            
##  [82] "p70-alpha"         "p70-s6k"           "p70s6k"           
##  [85] "ps6k"              "rps6kb1"           "rps6kb1-A"        
##  [88] "s6k"               "s6K1"              "stk14a"           
##  [91] "pms1"              "LOC108703225"      "LOC108700925"     
##  [94] "LOC108700150"      "gnb2-rs1"          "gnb2l1"           
##  [97] "h12.3"             "hlc-7"             "pig21"            
## [100] "mlh3"              "rps6ka5.L"         "L24"              
## [103] "rpl24"             "rpl24-A"           "rps9.L"           
## [106] "LOC121394610"      "rps20"             "pnpt1.S"          
## [109] "LOC108717861"      "dba8"              "rps7"             
## [112] "rpS8A"             "rpS8B"             "eprs"             
## [115] "mrpl49.L"          "mvd.L"             "nhp2.L"           
## [118] "LOC108710993"      "rps27l.L"          "mrpl30.S"         
## [121] "mrps15.L"          "rig"               "rps15"            
## [124] "mrpl35.L"          "L34"               "rpl34"            
## [127] "xl34"              "mvk.L"             "gfm2.L"           
## [130] "EFG2"              "GFM2"              "b"                
## [133] "l17"               "pd-1"              "rpl17"            
## [136] "rpl17-a"           "rpl17-b"           "rpl23"            
## [139] "rps19"             "rpl11"             "l35"              
## [142] "rpl35"             "LOC108698757"      "LOC108698451"     
## [145] "MGC114789"         "LOC108703484"      "pms2.L"           
## [148] "rpl6"              "rps15a"            "rps22"            
## [151] "LOC108706570"      "mrps31.S"          "rps27l.S"         
## [154] "rpk118"            "rps6kc1"           "s6pkh1"           
## [157] "mrp-s18-3"         "mrps18-3"          "mrps18a"          
## [160] "s18bmt"            "LOC108717877"      "hsp90b"           
## [163] "mlh1"              "mutL"              "LOC121393781"     
## [166] "LOC121393773"      "eprs.S"            "rps12"            
## [169] "rps12-a"           "rps12-b"           "rps12b"           
## [172] "rps24"             "LOC121393051"      "comp72"           
## [175] "l9mt"              "mrpl9"             "mrps31.L"         
## [178] "LOC100036779"      "dc47"              "LOC100037103"     
## [181] "mrp-s36"           "mrps36"            "mrps36.L"         
## [184] "LOC100037089"      "rpl7a"             "LOC100049136"     
## [187] "XB5843130"         "LOC100101334"      "rps3a-b"          
## [190] "LOC100126642"      "cgi-28"            "l4mt"             
## [193] "mrpl4"             "LOC100127328"      "LOC100037086"     
## [196] "ffcskk.L"          "fcsk"              "fuk"              
## [199] "fuk.L"             "LOC100137687"      "eprs1"            
## [202] "LOC100158438"      "imp7"              "ipo7"             
## [205] "MGC52556"          "ranbp7"            "rps8"             
## [208] "rpl35a"            "rpl4-a"            "rpl-4"            
## [211] "rpl1a"             "rpl18-a"           "rpl14a"           
## [214] "rps6"              "rps11"             "rpl8"             
## [217] "rpl28"             "rpl27a"            "rpl22"            
## [220] "rps27"             "rps13"             "rps4"             
## [223] "rps4x"             "rpl21"             "rpl26"            
## [226] "mrpl45"            "rps10"             "LOC100158393"     
## [229] "mrps16"            "mrps9"             "mrpl16"           
## [232] "mrpl16.L"          "MGC154377"         "rplp2"            
## [235] "LOC734179"         "MGC131313"         "mrpl23"           
## [238] "mrps24-b"          "MGC130639"         "mrp-s18-2"        
## [241] "mrps18-2"          "mrps18b"           "ptd017"           
## [244] "s18amt"            "SCL75"             "MGC116452"        
## [247] "MGC116477"         "rpl39"             "rpl39-a"          
## [250] "rpl39-b"           "rpl39a"            "rpl39b"           
## [253] "MGC116425"         "fau"               "MGC116435"        
## [256] "MGC114875"         "rps29"             "MGC130892"        
## [259] "mrpl52"            "MGC115435"         "MGC114621"        
## [262] "MGC98504"          "LOC733189"         "MGC115171"        
## [265] "LOC496101"         "mrpl40"            "LOC495996"        
## [268] "mrps11"            "LOC495474"         "mrpl20"           
## [271] "LOC495364"         "mrpl12"            "LOC495310"        
## [274] "mrps33"            "LOC494992"         "mrpl3"            
## [277] "rpl3l.L"           "LOC494722"         "rpl3l"            
## [280] "cgi-22"            "MGC84466"          "mrp-l14"          
## [283] "mrpl2"             "rpml14"            "MGC85550"         
## [286] "rps28p9"           "rpl36"             "LOC108700056"     
## [289] "MGC85428"          "rpl36a"            "rpl36a.S"         
## [292] "MGC85404"          "rpl38"             "l28"              
## [295] "LOC100101273"      "rpl28-a"           "rpl28-b"          
## [298] "MGC85384"          "rpl29"             "rpl29-a"          
## [301] "rpl29-b"           "rpl29b"            "l23a"             
## [304] "mda20"             "MGC85348"          "rpl23a"           
## [307] "MGC85310"          "mrpl51"            "mrpl28"           
## [310] "rps21"             "MGC86356"          "rps26"            
## [313] "MGC86316"          "rps23"             "mrpl15"           
## [316] "MGC78885"          "rpl17a"            "FIII"             
## [319] "ino80s"            "nf-e1"             "ucrbp"            
## [322] "xyy1"              "yin-yang-1"        "yy1"              
## [325] "yy1-a"             "yy1-b"             "mrps7"            
## [328] "MGC84358"          "rpl19.S"           "rpl19"            
## [331] "rpl19-prov"        "MGC83421"          "mrpl41-a"         
## [334] "LOC398653"         "MGC82151"          "rps25"            
## [337] "MGC82136"          "rpl13"             "rpl13-prov"       
## [340] "MGC83084"          "mrpl17"            "MGC81889"         
## [343] "rps27a"            "MGC82973"          "rpl37"            
## [346] "l30"               "MGC82844"          "rpl30"            
## [349] "rpl30-a"           "rpl30-b"           "MGC82841"         
## [352] "rps17"             "LOC108700787"      "MGC82808"         
## [355] "galk1"             "MGC82807"          "rps9.S"           
## [358] "MGC80804"          "rps9"              "MGC80774"         
## [361] "MGC80700"          "MGC80199"          "surf3"            
## [364] "trup"              "MGC80163"          "LOC108706905"     
## [367] "MGC80109"          "uba52"             "exosc9"           
## [370] "scl75"             "mrpl41-b"          "MGC83076"         
## [373] "rpl14"             "mrps24-a"          "LOC108700218"     
## [376] "MGC82306"          "rps18"             "MGC82245"         
## [379] "mrp-s13"           "mrp-s26"           "mrps13"           
## [382] "mrps26"            "rpms13"            "nhp2"             
## [385] "nola2"             "MGC81290"          "rpl31"            
## [388] "MGC68562"          "rplp1"             "ecgp"             
## [391] "gp96"              "grp94"             "hsp90b1"          
## [394] "MGC68448"          "tra1"              "MGC69128"         
## [397] "qars"              "qars.S"            "qars1"            
## [400] "MGC68529"          "rpl27"             "rps14"            
## [403] "rps14-prov"        "rpl4"              "mrpl44"           
## [406] "LOC108702869"      "MGC64315"          "rpl18"            
## [409] "rpl29a"            "L14B"              "LOC121393140"     
## [412] "rpl35a.L"          "LOC121400351"      "rpl37a"           
## [415] "MGC64263"          "rpl18a"            "mrt4"             
## [418] "mrto4"             "rps2"              "rps2e"            
## [421] "rpl9"              "rpl15"             "rpl10"            
## [424] "eef-2"             "eef2"              "eef2.1"           
## [427] "ef2"               "eftud2"            "snrp116"          
## [430] "snu114"            "rps12a"            "rpl13a"           
## [433] "RPL17"             "ckm"               "ckm.S"            
## [436] "ckmm"              "m-ck"              "rpl3"             
## [439] "rpl10a"            "rps6-a"            "rps6-b"           
## [442] "rps6b"             "rps3a-a"           "arbp"             
## [445] "l10e"              "lp0"               "prlp0"            
## [448] "rplp0"             "rpp0"              "rpl5"             
## [451] "LOC446289"         "rack1"             "exosc8"           
## [454] "MGC52847"          "rpl12"             "XL34"             
## [457] "mrpl10.S"          "trap1.S"           "rsl1d1"           
## [460] "trap1.L"           "mrps10.S"          "mrps21.S"         
## [463] "exosc5.L"          "rps6kl1.L"         "hsp90aa1.1.L"     
## [466] "LOC108698781"      "mrps21.L"          "LOC108697549"     
## [469] "LOC108697688"      "XB5896631.L"       "LOC108695345"     
## [472] "mrpl55.L"          "mrpl57.S"          "mrpl33.L"         
## [475] "mrpl14.L"          "mrps5.L"           "mrps14.S"         
## [478] "LOC108715766"      "LOC108715857"      "exosc6.L"         
## [481] "rps19bp1.L"        "LOC108714734"      "mrpl46.S"         
## [484] "mrpl42.S"          "mrpl46.L"          "rpl37a.L"         
## [487] "LOC108711439"      "LOC108712865"      "mrps6.S"          
## [490] "mrpl48"            "mrpl39.L"          "LOC108707957"     
## [493] "mrpl54.S"          "mrps18c.S"         "LOC108712230"     
## [496] "LOC108713416"      "eef2.2.L"          "MGC68699"         
## [499] "mrps18c.L"         "mrpl38"            "LOC108698235"     
## [502] "LOC108700022"      "mrpl27.L"          "rpl3l.S"          
## [505] "LOC108707348"      "LOC108708365"      "mrpl42.L"         
## [508] "galk2"             "LOC108710319"      "rpl7l1"           
## [511] "mrps27.L"          "mrpl47.L"          "mrpl19.S"         
## [514] "LOC108715263"      "LOC108716102"      "mrps22.L"         
## [517] "LOC108717404"      "LOC108705784"      "LOC108717492"     
## [520] "LOC121394503"      "LOC121395362"      "LOC121396125"     
## [523] "LOC108705858"      "LOC121396648"      "LOC121396642"     
## [526] "dap3.L"            "mrlp45"            "LOC108706079"     
## [529] "LOC121400386"      "LOC121400621"      "LOC121400576"     
## [532] "LOC108709120"      "mrpl22.L"          "LOC108712266"     
## [535] "LOC108712883"      "LOC121402816"      "LOC121402815"     
## [538] "LOC121393045"      "LOC100037062"      "rpl22l1"          
## [541] "LOC100037111"      "LOC733390"         "mrpl37"           
## [544] "LOC100037184"      "PDCD9"             "LOC100049095"     
## [547] "LOC443704"         "mrps2"             "LOC100049746"     
## [550] "mrps25"            "LOC100126614"      "LOC100126616"     
## [553] "mrpl32"            "LOC100127333"      "LOC100127340"     
## [556] "mrps35"            "LOC100137646"      "LOC445824"        
## [559] "LOC733351"         "MGC131350"         "mrps30"           
## [562] "LOC733400"         "MGC130910"         "rpl7"             
## [565] "LOC733385"         "MGC131341"         "mrpl21"           
## [568] "rps16"             "LOC446962"         "LOC733302"        
## [571] "LOC734162"         "LOC496259"         "mrpl11"           
## [574] "LOC496258"         "mrpl13"            "LOC496046"        
## [577] "exosc4"            "LOC495942"         "LOC495666"        
## [580] "LOC495349"         "LOC495212"         "MGC85354"         
## [583] "mrpl53"            "MGC85374"          "rpl32"            
## [586] "MGC85232"          "MGC84749"          "eft-2-prov"       
## [589] "exosc7"            "exosc7-prov"       "mrpl1"            
## [592] "mrpl1-prov"        "MGC81028"          "rlp24"            
## [595] "rpl24l"            "rsl24d1"           "rvas3"            
## [598] "MGC82344"          "imp3"              "MGC81216"         
## [601] "MGC80065"          "rps5"              "mrps17"           
## [604] "mrps12"            "LOC398682"         "mrpl18"           
## [607] "mrps34"            "mrpl43"            "RPL18A"           
## [610] "pdcd9"             "S14"               "RACK1"
grep("rpl", rownames(xenopus), value = T) %>% sort()
##   [1] "mrpl1.L"   "mrpl1.S"   "mrpl10.S"  "mrpl11.L"  "mrpl11.S"  "mrpl12.L" 
##   [7] "mrpl13.S"  "mrpl14.L"  "mrpl15.L"  "mrpl16.S"  "mrpl17.S"  "mrpl18.L" 
##  [13] "mrpl19.S"  "mrpl2.L"   "mrpl20.L"  "mrpl21.L"  "mrpl22.L"  "mrpl23.S" 
##  [19] "mrpl24.L"  "mrpl27.L"  "mrpl28.L"  "mrpl3.L"   "mrpl3.S"   "mrpl30.S" 
##  [25] "mrpl32.L"  "mrpl32.S"  "mrpl33.L"  "mrpl35.L"  "mrpl36.L"  "mrpl37.L" 
##  [31] "mrpl38.L"  "mrpl39.L"  "mrpl4.L"   "mrpl40.S"  "mrpl41.L"  "mrpl41.S" 
##  [37] "mrpl42.L"  "mrpl42.S"  "mrpl43.L"  "mrpl44.S"  "mrpl45.L"  "mrpl46.L" 
##  [43] "mrpl46.S"  "mrpl47.L"  "mrpl48.S"  "mrpl49.L"  "mrpl51.L"  "mrpl52.L" 
##  [49] "mrpl52.S"  "mrpl53.L"  "mrpl54.S"  "mrpl55.L"  "mrpl57.S"  "mrpl58.L" 
##  [55] "mrpl58.S"  "mrpl9.S"   "rpl10.L"   "rpl10.S"   "rpl10a.S"  "rpl11.L"  
##  [61] "rpl11.S"   "rpl12.L"   "rpl12.S"   "rpl13.S"   "rpl13a.L"  "rpl13a.S" 
##  [67] "rpl14.S"   "rpl15.L"   "rpl15.S"   "rpl17.L"   "rpl17.S"   "rpl18.L"  
##  [73] "rpl18.S"   "rpl18a.L"  "rpl18a.S"  "rpl19.L"   "rpl21.L"   "rpl22.L"  
##  [79] "rpl22.S"   "rpl22l1.L" "rpl23.S"   "rpl23a.L"  "rpl23a.S"  "rpl24.L"  
##  [85] "rpl24.S"   "rpl26.L"   "rpl26.S"   "rpl27.L"   "rpl27.S"   "rpl27a.L" 
##  [91] "rpl27a.S"  "rpl28.L"   "rpl28.S"   "rpl29.L"   "rpl29.S"   "rpl3.L"   
##  [97] "rpl30.L"   "rpl30.S"   "rpl31.L"   "rpl31.S"   "rpl32.L"   "rpl34.L"  
## [103] "rpl34.S"   "rpl35.L"   "rpl35a.S"  "rpl36.L"   "rpl36.S"   "rpl36a.L" 
## [109] "rpl37.L"   "rpl37.S"   "rpl38.L"   "rpl38.S"   "rpl39.L"   "rpl39.S"  
## [115] "rpl4.L"    "rpl4.S"    "rpl5.L"    "rpl5.S"    "rpl6.L"    "rpl6.S"   
## [121] "rpl7.L"    "rpl7a.L"   "rpl7a.S"   "rpl7l1.L"  "rpl8.L"    "rpl8.S"   
## [127] "rpl9.L"    "rplp0.L"   "rplp0.S"   "rplp1.L"   "rplp2.L"   "rplp2.S"
ribo.genes <- intersect( rownames(xenopus), c(provisional, provisional2) )
ribo.genes
##   [1] "mrps26.L"     "ptcd3.L"      "mrpl35.L"     "rpl9.L"       "exosc9.L"    
##   [6] "rpl34.L"      "mrpl1.L"      "rpl36.L"      "eef2.1.L"     "LOC108712230"
##  [11] "rps15.L"      "rps28p9.L"    "lonp1.L"      "uba52.L"      "LOC108713416"
##  [16] "rps6.L"       "mrps18c.L"    "mrpl52.L"     "mvk.L"        "rplp0.L"     
##  [21] "rpl6.L"       "rps23.L"      "mrps27.L"     "gfm2.L"       "nsa2.L"      
##  [26] "mrps30.L"     "rpl37.L"      "rpl17.L"      "LOC108706570" "rpl34.S"     
##  [31] "mrpl1.S"      "rpl36.S"      "eef2.1.S"     "mrpl54.S"     "rps15.S"     
##  [36] "rps28p9.S"    "LOC108706905" "rps6.S"       "mrps18c.S"    "mrpl52.S"    
##  [41] "rplp0.S"      "mrpl40.S"     "rpl6.S"       "rps23.S"      "nsa2.S"      
##  [46] "mrps30.S"     "rpl37.S"      "rpl17.S"      "mrpl39.L"     "riox2.L"     
##  [51] "rpl23a.L"     "rps6ka3.L"    "LOC398653"    "rps6ka1.L"    "rps10.L"     
##  [56] "rpl8.L"       "mrps15.L"     "rpl11.L"      "rps6kb1.L"    "mrps17.L"    
##  [61] "rpl24.L"      "LOC108708365" "rpl31.L"      "mrps9.L"      "rps26.L"     
##  [66] "mrps31.L"     "exosc8.L"     "rpl21.L"      "rpl23a.S"     "mrps6.S"     
##  [71] "rps6ka3.S"    "XB5843130.S"  "rpl8.S"       "rps10.S"      "rpl10a.S"    
##  [76] "rps6ka6.S"    "LOC121400351" "rpl11.S"      "rps6kb1.S"    "rpl24.S"     
##  [81] "rpl31.S"      "mrpl30.S"     "rps26.S"      "mrps31.S"     "exosc8.S"    
##  [86] "MGC80700"     "mrpl48.S"     "mrps33.L"     "mrps35.L"     "mrpl22.L"    
##  [91] "rps14.L"      "rpl26.L"      "LOC108710993" "hsp90b1.L"    "rps16.L"     
##  [96] "mrpl42.L"     "rpl18a.L"     "rps27l.L"     "rps17.L"      "mrpl18.L"    
## [101] "rsl24d1.L"    "galk2.L"      "LOC108711439" "mrps11.L"     "mrpl46.L"    
## [106] "rpl4.L"       "mrpl4.L"      "mrps33.S"     "rps14.S"      "rpl4.S"      
## [111] "mrpl46.S"     "mrps11.S"     "LOC108712865" "LOC108712883" "rsl24d1.S"   
## [116] "rps17.S"      "rps27l.S"     "rpl18a.S"     "mrpl42.S"     "rps16.S"     
## [121] "hsp90b1.S"    "nhp2.S"       "rpl26.S"      "LOC108703484" "rps13.L"     
## [126] "ipo7.L"       "mrpl21.L"     "rplp2.L"      "mrpl49.L"     "fau.L"       
## [131] "rps6ka4.L"    "mrpl11.L"     "ogfod1.L"     "LOC121393045" "mvd.L"       
## [136] "exosc6.L"     "ffcskk.L"     "rps8.L"       "mrpl37.L"     "rpl5.L"      
## [141] "snu13.L"      "rpsa.L"       "imp3.L"       "rpl3.L"       "rps19bp1.L"  
## [146] "rpl29.L"      "rpl32.L"      "mrps25.L"     "qars1.L"      "LOC108714734"
## [151] "mrpl11.S"     "fau.S"        "rplp2.S"      "mrpl23.S"     "ipo7.S"      
## [156] "rps13.S"      "rpl13.S"      "rps8.S"       "rpl5.S"       "snu13.S"     
## [161] "rpsa.S"       "imp3.S"       "LOC108715766" "rpl29.S"      "LOC108715857"
## [166] "mrps25.S"     "qars1.S"      "rps27a.L"     "rps6kc1.L"    "eprs1.L"     
## [171] "mrpl33.L"     "LOC108716102" "mrpl14.L"     "mrps18a.L"    "mrpl2.L"     
## [176] "rpl7l1.L"     "mrps5.L"      "rps12.L"      "LOC121393773" "LOC121393781"
## [181] "mrpl47.L"     "rpl22l1.L"    "gfm1.L"       "mrps22.L"     "rps7.L"      
## [186] "hsp90ab1.S"   "eprs1.S"      "rps6kc1.S"    "mrpl57.S"     "rps27a.S"    
## [191] "pnpt1.S"      "LOC108717404" "LOC108717861" "mrps36.S"     "rps12.S"     
## [196] "LOC121393140" "LOC108705784" "rpl35a.S"     "mrpl44.S"     "rps7.S"      
## [201] "LOC108717492" "mrpl19.S"     "mrpl55.L"     "rpl15.L"      "top2b.L"     
## [206] "LOC121394610" "mrpl3.L"      "mrpl32.L"     "impact.L"     "mrpl15.L"    
## [211] "rps20.L"      "rpl7.L"       "LOC121394503" "mrpl53.L"     "rpl30.L"     
## [216] "rplp1.L"      "rpl15.S"      "top2b.S"      "rpl14.S"      "mrpl3.S"     
## [221] "exosc7.S"     "mrpl32.S"     "mlh1.S"       "rps20.S"      "LOC108695345"
## [226] "rpl30.S"      "mrpl13.S"     "MGC114621"    "exosc4.S"     "mrpl51.L"    
## [231] "mrpl43.L"     "LOC121393051" "rps24.L"      "rpl27a.L"     "XB5896631.L" 
## [236] "rps25.L"      "mrpl20.L"     "mrps16.L"     "rpl22.L"      "rpl18.L"     
## [241] "rpl28.L"      "rps9.L"       "rps19.L"      "mrto4.L"      "mrpl16.S"    
## [246] "LOC108697549" "rpl27a.S"     "rps25.S"      "rpl22.S"      "LOC108697688"
## [251] "rpl18.S"      "rpl28.S"      "rps9"         "rps9.S"       "rps19.S"     
## [256] "mrto4.S"      "rpl10.L"      "rpl12.L"      "mrps2.L"      "rpl7a.L"     
## [261] "rpl35.L"      "rpl36a.L"     "rps4x.L"      "rpl39.L"      "LOC108698451"
## [266] "ckm.L"        "rack1.L"      "rps5.L"       "rps18.L"      "mrps18b.L"   
## [271] "exosc5.L"     "mrps12.L"     "mlh3.L"       "rps6ka5.L"    "LOC108698757"
## [276] "yy1.L"        "LOC108698781" "mrps21.L"     "mrpl24.L"     "rps27.L"     
## [281] "dap3.L"       "rpl10.S"      "yy1.S"        "rps29.S"      "mrps10.S"    
## [286] "mrpl17.S"     "rpl12.S"      "mrps2.S"      "rpl7a.S"      "LOC108700022"
## [291] "rps4x.S"      "LOC108700056" "rpl39.S"      "LOC108700150" "rack1.S"     
## [296] "rps5.S"       "LOC108700218" "mrps18b.S"    "mrps12.S"     "mrps21.S"    
## [301] "rps27.S"      "mrpl9.S"      "top2a.L"      "rpl19.L"      "LOC108700787"
## [306] "mrpl45.L"     "eftud2.L"     "LOC108700925" "rpl38.L"      "mrpl12.L"    
## [311] "rpl27.L"      "mrpl27.L"     "mrps7.L"      "galk1.L"      "mrpl38.L"    
## [316] "rps11.L"      "rpl13a.L"     "mrpl28.L"     "rsl1d1.L"     "MGC114789"   
## [321] "pms2.L"       "trap1.L"      "rps2.L"       "rpl23.S"      "mrpl10.S"    
## [326] "eftud2.S"     "rps21.S"      "rpl38.S"      "rpl27.S"      "mrps7.S"     
## [331] "LOC108702869" "rpl13a.S"     "pms1.S"       "rsl1d1.S"     "rps15a.S"    
## [336] "mrps34.S"     "trap1.S"      "LOC108703225"
ribo.genes %>% length()
## [1] 338

Save the ribosomal genes that you have curated. Then compute the percentage of ribosomal counts in each cell and store it in the metadata:

xenopus[["percent.ribo"]] <- Seurat::PercentageFeatureSet( xenopus, features = ribo.genes )
xenopus@meta.data
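To make explicit what this percentage means: `PercentageFeatureSet` reports, for each cell, the share of all counts that fall on the given features. A minimal base-R sketch on a made-up count matrix (`counts.toy` and `ribo.toy` are illustrative names, not part of the dataset):

```r
# Toy count matrix: genes in rows, cells in columns
counts.toy <- matrix(
  c(5, 0, 3,
    1, 1, 2,
    4, 9, 5),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("rpl3.L", "rps5.S", "actb.L"), c("cell1", "cell2", "cell3"))
)
ribo.toy <- c("rpl3.L", "rps5.S")

# Counts on the gene set divided by all counts in the cell, as a percentage
percent.ribo.toy <- colSums(counts.toy[ribo.toy, , drop = FALSE]) /
  colSums(counts.toy) * 100

print(percent.ribo.toy)
```

Every toy cell has 10 counts in total, so the three values can be checked by hand.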

6.3 Hemoglobin genes

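Depending on the situation, it helps to have a list of hemoglobin genes so that you can recognize one particular cell type, the red blood cells. Hemoglobin genes follow a typical naming convention (the hb prefix), which I have followed to build the list below, but make sure the entries are correct. For example, the hbp1 and hbs1l entries in the output below look like non-globin genes (in human, HBP1 is an HMG-box transcription factor and HBS1L a translational GTPase).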

rbc.genes <- 
setdiff( 
    grep("^hb", rownames(xenopus), value = T),
    grep("^hb(egf|ox)", rownames(xenopus), value = T)
)

rbc.genes
## [1] "hbp1.L"   "hbp1.S"   "hbs1l.L"  "hba-l5.L" "hba3.L"   "hbg1.L"   "hbg2.S"
xenopus[["percent.rbc"]] <- Seurat::PercentageFeatureSet( xenopus, features = rbc.genes )
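The pattern above keeps every name that starts with hb, except those starting with hbegf or hbox, so unrelated names that happen to share the prefix survive. A minimal sketch on made-up feature names, comparing it with a stricter pattern (the stricter pattern is only an assumption about the naming convention, so check it against Xenbase before relying on it):

```r
# Toy feature names: some globins plus look-alikes sharing the "hb" prefix
features.toy <- c("hba3.L", "hba-l5.L", "hbg1.L", "hbg2.S",
                  "hbp1.L", "hbp1.S", "hbs1l.L", "hbegf.L", "hbox10.L")

# Same logic as above: everything starting with "hb", minus known exceptions
broad <- setdiff(
  grep("^hb", features.toy, value = TRUE),
  grep("^hb(egf|ox)", features.toy, value = TRUE)
)

# Hypothetical stricter pattern: "hb", one globin letter, then a digit, "-" or "."
strict <- grep("^hb[abdegz]([0-9]|-|\\.)", features.toy, value = TRUE)

# Names that the broad pattern keeps but the strict one rejects
print(setdiff(broad, strict))
```

The names printed at the end are the ones worth checking by hand.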

Save the hemoglobin genes that you have curated
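One simple way to save a curated gene list is a plain text file with one gene per line. A minimal sketch, assuming a toy vector and a file in the temporary directory (`rbc.genes.toy` and the file name are illustrative; in the session you would use your own vector and a path inside your project):

```r
# Toy curated list standing in for the real one
rbc.genes.toy <- c("hba3.L", "hbg1.L", "hbg2.S")

# Write one gene per line; tempdir() is used only to keep the sketch self-contained
out.file <- file.path(tempdir(), "rbc_genes.txt")
writeLines(rbc.genes.toy, out.file)

# Read the list back and confirm nothing changed
restored <- readLines(out.file)
print(identical(restored, rbc.genes.toy))
```

A text file is easy to inspect and to share; `saveRDS()` and `readRDS()` are an alternative when you want to keep an R object as is.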

With these three additions (percent.mt, percent.ribo and percent.rbc), you now have per-cell meta information that you can inspect in the @meta.data slot of the Seurat object:

xenopus@meta.data

We can now visualize these metrics:

VlnPlot(
  xenopus, 
  features = c(
    "nFeature_RNA", 
    "nCount_RNA", 
    "percent.mt",
    "percent.ribo"
  ), 
  ncol = 4,
  pt.size = 0
)

VlnPlot(
  xenopus,
  features = c(
    "percent.rbc"
  )
)

6.4 Cell-cycle genes

The Seurat tutorial takes its cell-cycle genes, rather ad hoc, from a single paper. They ship with Seurat as cc.genes (human gene symbols), so we need to map them to the Xenopus gene names:

data('cc.genes')

gene.list.frog  <- rownames(xenopus)

# cc.genes$s.genes   <- map( cc.genes$s.genes, simpleCap )
# cc.genes$g2m.genes <- map( cc.genes$g2m.genes, simpleCap )

# setdiff( tolower(cc.genes$s.genes), gene.list.frog ) # Mlf1ip is Cenpu

tolower(cc.genes$s.genes) %>% sort()
##  [1] "atad2"    "blm"      "brip1"    "casp8ap2" "ccne2"    "cdc45"   
##  [7] "cdc6"     "cdca7"    "chaf1b"   "clspn"    "dscc1"    "dtl"     
## [13] "e2f8"     "exo1"     "fen1"     "gins2"    "gmnn"     "hells"   
## [19] "mcm2"     "mcm4"     "mcm5"     "mcm6"     "mlf1ip"   "msh2"    
## [25] "nasp"     "pcna"     "pola1"    "pold3"    "prim1"    "rad51"   
## [31] "rad51ap1" "rfc2"     "rpa2"     "rrm1"     "rrm2"     "slbp"    
## [37] "tipin"    "tyms"     "ubr7"     "uhrf1"    "ung"      "usp1"    
## [43] "wdr76"
grep( 
    paste0( "^(", paste( tolower(cc.genes$s.genes), collapse = "|" ), ")" ),
    gene.list.frog,
    value = T
) %>% sort()
##  [1] "atad2.L"     "atad2.S"     "atad2b.L"    "atad2b.S"    "blm.S"      
##  [6] "blmh.L"      "blmh.S"      "brip1.L"     "ccne2.L"     "ccne2.S"    
## [11] "cdc45.L"     "cdc45.S"     "cdc6.L"      "cdc6.S"      "cdca7.L"    
## [16] "cdca7.S"     "cdca7l.S"    "clspn.S"     "dscc1.L"     "dtl.L"      
## [21] "dtl.S"       "e2f8.L"      "exo1.S"      "fen1.L"      "fen1.S"     
## [26] "gins2.L"     "gmnn.L"      "gmnn.S"      "hells.L"     "hells.S"    
## [31] "mcm2.L"      "mcm2.S"      "mcm4.L"      "mcm4.S"      "mcm5.L"     
## [36] "mcm5.S"      "mcm6.2.L"    "mcm6.2.S"    "msh2.L"      "nasp.L"     
## [41] "nasp.S"      "pcna.L"      "pcna.S"      "pola1.S"     "pold3.L"    
## [46] "prim1.S"     "rad51.L"     "rad51.S"     "rad51ap1.L"  "rad51c.L"   
## [51] "rad51d.L"    "rfc2.L"      "rpa2.L"      "rpa2.S"      "rrm1.L"     
## [56] "rrm1.S"      "rrm2.1.L"    "rrm2.2.L"    "rrm2b.S"     "slbp.L"     
## [61] "slbp.S"      "tipin.L"     "tipin.S"     "tyms.L"      "ubr7.L"     
## [66] "ubr7.S"      "uhrf1.L"     "uhrf1.S"     "uhrf1bp1l.L" "ung.L"      
## [71] "usp1.L"      "usp1.S"      "usp10.L"     "usp10.S"     "usp12.L"    
## [76] "usp12.S"     "usp12b.L"    "usp12b.S"    "usp13.L"     "usp14.L"    
## [81] "usp14.S"     "usp15.L"     "usp16.L"     "usp16.S"     "usp19.L"    
## [86] "usp19.S"     "wdr76.S"
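The prefix match above also picks up longer names such as usp10.L or blmh.L. Requiring a literal dot right after the gene symbol restricts the match to the symbol itself plus its suffix. A minimal sketch on made-up names (`gene.list.toy` and `wanted` are illustrative):

```r
# Toy feature names and two toy gene symbols to look up
gene.list.toy <- c("usp1.L", "usp1.S", "usp10.L", "usp12.S", "blm.S", "blmh.L")
wanted <- c("usp1", "blm")

# Prefix only: anything that merely starts with one of the symbols
prefix.pattern <- paste0("^(", paste(wanted, collapse = "|"), ")")
loose <- grep(prefix.pattern, gene.list.toy, value = TRUE)

# Symbol followed by a literal dot: only the .L / .S (or .1.L style) names
exact.pattern <- paste0("^(", paste(wanted, collapse = "|"), ")\\.")
exact <- grep(exact.pattern, gene.list.toy, value = TRUE)

print(loose)
print(exact)
```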
map_dfr(
    cc.genes$s.genes,
    function(g) {
        tibble(
            gene = tolower(g),
            match = paste( grep( paste0( "^", tolower(g), "\\." ), gene.list.frog, value = T), collapse = "," )
        )
    }
)
map_dfr(
    cc.genes$g2m.genes,
    function(g) {
        tibble(
            gene = tolower(g),
            match = paste( grep( paste0( "^", tolower(g), "\\." ), gene.list.frog, value = T), collapse = "," )
        )
    }
)
# Two S-phase genes have no direct match:
# casp8ap2
# mlf1ip (current symbol: cenpu)

grep("cenpu", gene.list.frog, value = T)
## [1] "cenpu.L"
# grep("ced-", gene.list.frog, value = T)  # https://www.xenbase.org/entry/gene/showgene.do?method=displayGeneSummary&geneId=XB-GENE-6257801

# G2/M genes with no direct match:
# fam64a
# ckap2l
# hjurp
# hn1
# cdca2
# psrc1

grep("pimreg", gene.list.frog, value = T) # fam64a
## [1] "pimreg.L"
grep("ckap2", gene.list.frog, value = T) # ckap2l https://www.xenbase.org/entry/gene/showgene.do?method=displayGeneSummary&geneId=XB-GENE-13579809 not present?
## [1] "ckap2.L" "ckap2.S"
grep("jpt1", gene.list.frog, value = T) # hn1
## [1] "jpt1.L" "jpt1.S"
grep("22068216", gene.list.frog, value = T) # https://www.xenbase.org/entry/gene/showgene.do?method=displayGeneSummary&geneId=XB-GENE-22068215
## character(0)
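Instead of reading the tables by eye, the symbols without any match can also be listed programmatically. A minimal sketch on made-up vectors (`human.genes.toy` and `frog.genes.toy` are illustrative, not the real lists):

```r
# Toy human symbols and toy frog feature names
human.genes.toy <- c("PCNA", "MLF1IP", "CASP8AP2", "MCM2")
frog.genes.toy  <- c("pcna.L", "pcna.S", "mcm2.L", "cenpu.L")

# For each lower-cased symbol, is there a frog name starting with "<symbol>."?
has.match <- vapply(
  tolower(human.genes.toy),
  function(g) any(startsWith(frog.genes.toy, paste0(g, "."))),
  logical(1)
)

# Symbols that need a manual lookup (renamed or absent genes)
unmatched <- names(has.match)[!has.match]
print(unmatched)
```

The unmatched symbols are the ones to look up by hand on Xenbase, as done above for cenpu, pimreg and jpt1.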
cc.genes.frog <- cc.genes
cc.genes.frog$s.genes <- c(
    grep( 
        paste0( "^(", paste( tolower(cc.genes$s.genes), collapse = "|" ), ")\\." ),
        gene.list.frog,
        value = T
    ),
    "cenpu.L"
)

cc.genes.frog$g2m.genes <- c(
    grep( 
        paste0( "^(", paste( tolower(cc.genes$g2m.genes), collapse = "|" ), ")\\." ),
        gene.list.frog,
        value = T
    ),
    "pimreg.L",
    "jpt1.L",
    "jpt1.S"
)

cc.genes.frog
## $s.genes
##  [1] "slbp.L"     "uhrf1.L"    "ung.L"      "cdc45.L"    "slbp.S"    
##  [6] "uhrf1.S"    "cdc45.S"    "rpa2.L"     "brip1.L"    "rfc2.L"    
## [11] "rrm1.L"     "pold3.L"    "pola1.S"    "rpa2.S"     "clspn.S"   
## [16] "prim1.S"    "rrm1.S"     "tipin.L"    "pcna.L"     "pcna.S"    
## [21] "blm.S"      "wdr76.S"    "tipin.S"    "e2f8.L"     "fen1.L"    
## [26] "gins2.L"    "usp1.L"     "nasp.L"     "mcm5.L"     "mcm2.L"    
## [31] "fen1.S"     "usp1.S"     "nasp.S"     "mcm5.S"     "mcm2.S"    
## [36] "msh2.L"     "dtl.L"      "rrm2.2.L"   "rrm2.1.L"   "dtl.S"     
## [41] "gmnn.L"     "tyms.L"     "mcm4.L"     "ccne2.L"    "dscc1.L"   
## [46] "atad2.L"    "gmnn.S"     "mcm4.S"     "ccne2.S"    "atad2.S"   
## [51] "hells.L"    "hells.S"    "rad51ap1.L" "ubr7.L"     "rad51.L"   
## [56] "ubr7.S"     "exo1.S"     "rad51.S"    "cdc6.L"     "mcm6.2.L"  
## [61] "cdca7.L"    "cdc6.S"     "mcm6.2.S"   "cdca7.S"    "cenpu.L"   
## 
## $g2m.genes
##  [1] "tacc3.L"   "hmgb2.L"   "cenpe.L"   "cks2.L"    "tacc3.S"   "hmgb2.S"  
##  [7] "cks2.S"    "ndc80.L"   "cdca8.L"   "cbx5.L"    "ckap2.L"   "ndc80.S"  
## [13] "cdca8.S"   "birc5.S"   "cbx5.S"    "ckap2.S"   "hmmr.L"    "cdc25c.L" 
## [19] "gas2l3.L"  "tmpo.L"    "gtse1.L"   "ccnb2.L"   "kif23.L"   "aurkb.L"  
## [25] "kif23.S"   "ccnb2.S"   "gtse1.S"   "tmpo.S"    "cdc25c.S"  "aurkb.S"  
## [31] "ckap5.L"   "ctcf.L"    "cdc20.L"   "kif2c.L"   "nuf2.L"    "rangap1.L"
## [37] "ctcf.S"    "cdc20.S"   "kif2c.S"   "nuf2.S"    "rangap1.S" "cenpf.L"  
## [43] "bub1.L"    "ect2.L"    "smc4.L"    "nek2.S"    "ect2.S"    "anln.L"   
## [49] "cdca3.L"   "cdk1.L"    "mki67.L"   "kif11.L"   "ncapd2.S"  "cdk1.S"   
## [55] "mki67.S"   "kif11.S"   "kif20b.S"  "tubb4b.L"  "cenpa.L"   "dlgap5.L" 
## [61] "nusap1.L"  "anp32e.L"  "cks1b.L"   "g2e3.S"    "dlgap5.S"  "tubb4b.S" 
## [67] "nusap1.S"  "anp32e.S"  "cks1b.S"   "top2a.L"   "aurka.L"   "tpx2.L"   
## [73] "ube2c.L"   "aurka.S"   "tpx2.S"    "ube2c.S"   "pimreg.L"  "jpt1.L"   
## [79] "jpt1.S"
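
To see what the regular expression built with paste0() actually matches, here is a minimal sketch on a toy gene list. The names human.genes and toy.gene.list are made up for illustration; they are not the real cc.genes or gene.list.frog.

```r
# Toy inputs (hypothetical, for illustration only)
human.genes   <- c("PCNA", "MCM2")
toy.gene.list <- c("pcna.L", "pcna.S", "mcm2.L", "mcm2l.S", "xpcna.L", "pcna")

# Same construction as above: lower-case the symbols, join them with "|",
# anchor at the start of the name and require a literal dot right after
pattern <- paste0("^(", paste(tolower(human.genes), collapse = "|"), ")\\.")
print(pattern)

# Only names that start with one of the symbols followed by "." are kept
matched <- grep(pattern, toy.gene.list, value = TRUE)
print(matched)
```

Here only pcna.L, pcna.S and mcm2.L are kept: mcm2l.S fails because the symbol is not followed by a dot, xpcna.L fails because of the start anchor, and pcna has no suffix at all. Note that anything may follow the dot, which is why names such as rrm2.2.L also appear in the list above.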

Try out below:

# DefaultAssay(xenopus) <- "RNA"
# 
# xenopus <-
# CellCycleScoring(
#   xenopus,
#   s.features = unlist(cc.genes.frog$s.genes),
#   g2m.features = unlist(cc.genes.frog$g2m.genes),
#   set.ident = F
# )
# 
# xenopus@meta.data

Why is the error happening?

7 Investigating meta information (cell characteristics)

7.1 UMI to Gene relationship

xenopus@meta.data %>%
  dplyr::slice(sample(1:n()))  %>%  # just to avoid any artificial "clumping" because of the library
  ggplot( aes(x = nCount_RNA, y = nFeature_RNA)) +
#  ggplot( aes(x = nUMI, y = nGene, colour=percent.mito)) +
  geom_point( alpha = 0.5 ) + 
  geom_smooth() +
  # geom_hline( yintercept = 500, linetype = "dashed", colour="salmon" ) +
  # geom_hline( yintercept = 300, linetype = "dashed", colour="blue" ) +
  scale_x_continuous( 
    labels = scales::comma 
  ) +
  scale_y_continuous(labels = scales::comma) +
#  facet_wrap( library ~ . , scale = "free_x") +
  # scale_color_hue(name = "mitochondrial content", 
  #                 labels = c(">25%", "<=25%")) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
## `geom_smooth()` using method = 'gam' and formula = 'y ~ s(x, bs = "cs")'

This looks OK, with some large cells (in terms of RNA content) spread out around the UMI = 2,500 threshold.

Some datasets also contain cells that have lower gene content than the rest. That is not the case in this dataset, but here is one example:

Example UMI Gene relationship

xenopus@meta.data %>%
  mutate(
    call = case_when(
      percent.mt > 20 ~ "mito > 20%",
      percent.rbc > 1 ~ "rbc > 1%",
      TRUE ~ "pass"
    )
  ) %>%
  dplyr::slice(sample(1:n())) %>%   # just to avoid any artificial "clumping" because of the library
  ggplot( aes(x = nCount_RNA, y = nFeature_RNA, colour=call)) +
#  ggplot( aes(x = nUMI, y = nGene, colour=percent.mito)) +
  geom_point( alpha = 0.5 ) + 
  geom_hline( yintercept = 1000, linetype = "dashed", colour="salmon" ) +
  scale_x_continuous( 
    labels = scales::comma 
  ) +
  scale_y_continuous(labels = scales::comma) +
  scale_color_manual(
    name = "outliers",
    values = c(
      "mito > 20%" = "blue",
      "rbc > 1%" = "salmon",
      "pass" =  "grey"
    )
  ) +
  ggtitle(
    "Stable relationship between nUMI and nGene"
  ) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

Again, a different example, showing a case where RBC contamination is strong.

Example of RBC contamination in the dataset

7.2 Mitochondrial content

xenopus@meta.data %>%
  dplyr::slice(sample(1:n())) %>%   # just to avoid any artificial "clumping" because of the library
  mutate(
    state = case_when(
      percent.mt > 20 ~ "mito(20%+)",
      percent.mt > 5 ~ "mito(5%+)",
      percent.rbc  > 1 ~ "rbc(1%+)",
      TRUE ~ "pass"
    )
  ) %>%
  ggplot( aes(x = nFeature_RNA, y = percent.mt, colour=state)) +
  geom_point( alpha = 0.5 ) + 
  geom_hline( yintercept = 1, linetype = "dashed", colour="salmon" ) + 
  geom_hline( yintercept = 5, linetype = "dashed", colour="salmon" ) + 
  geom_hline( yintercept = 20, linetype = "dashed", colour="navy" ) + 
  geom_vline( xintercept = 1000, linetype = "dashed", colour="blue" ) +
  geom_vline( xintercept = 2500, linetype = "dashed", colour="salmon" ) +
  scale_x_continuous( 
    breaks = c(0, 1000, 2000, 4000, 6000, 8000, 10000, 20000, 30000), 
    labels = scales::comma 
  ) +
  scale_color_manual(
      name = "outliers",
      values = c(
          "mito(20%+)" = "navy", 
          "mito(5%+)" = "green",
          "pass" = "grey", 
          "rbc(1%+)" = "salmon"
      )
  ) +
  ggtitle("Exploration of potential dead cells",
          paste0(
            "Clear inverse relationship between # of genes and mito content,\n",
            "for low nGene small cells"
          )
  ) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
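
One detail of the case_when() call that is easy to miss: the conditions are checked from top to bottom and the first one that is TRUE wins. A minimal sketch on a made-up data frame (the four cells and their values are invented for illustration):

```r
library(dplyr)

# Hypothetical cells, one per situation of interest
toy <- data.frame(
  cell        = c("a", "b", "c", "d"),
  percent.mt  = c(25, 8, 2, 30),
  percent.rbc = c(0, 0, 3, 5)
)

# Same rules and same order as in the plot above
toy <- toy %>%
  mutate(
    state = case_when(
      percent.mt > 20 ~ "mito(20%+)",
      percent.mt > 5  ~ "mito(5%+)",
      percent.rbc > 1 ~ "rbc(1%+)",
      TRUE            ~ "pass"
    )
  )

print(toy)
```

Cell d has both high mitochondrial and high RBC content, but it is labelled mito(20%+) because that rule comes first. If the 5% rule were listed before the 20% rule, no cell would ever receive the 20% label.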

7.3 Ribosomal content

colnames( xenopus@meta.data )
## [1] "orig.ident"   "nCount_RNA"   "nFeature_RNA" "percent.mt"   "percent.ribo"
## [6] "percent.rbc"
xenopus@meta.data %>%
  dplyr::filter( percent.mt < 20 ) %>%
  ggplot( aes( x = orig.ident, y = percent.ribo ) ) +
  geom_violin( scale="width", trim = TRUE )

xenopus@meta.data %>%
  ggplot( aes( x = percent.mt, y = percent.ribo ) ) +
  geom_point( alpha = 0.2 ) +
  # theme(
  #   axis.text.x = element_text( angle = 45, hjust = 1 )
  # ) +
#  coord_flip() +
  ggtitle("Relationship between ribosomal and mitochondrial content")

8 Standard Pre-processing workflow

8.1 Normalization

xenopus <- NormalizeData( 
  xenopus,
  normalization.method = "LogNormalize",
  scale.factor = 10000
)
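
To make the "LogNormalize" method concrete: each count is divided by the total counts of its cell, multiplied by scale.factor, and then transformed with log1p(). The sketch below redoes that arithmetic by hand on a toy matrix (the gene and cell names are made up); it is an illustration of the formula, not Seurat's actual implementation.

```r
# Toy count matrix: 3 genes (rows) x 2 cells (columns); values are invented
counts <- matrix(
  c(1, 0, 9,    # cell1
    5, 5, 0),   # cell2
  nrow = 3,
  dimnames = list(c("geneA", "geneB", "geneC"), c("cell1", "cell2"))
)

scale.factor <- 10000

# Divide every column by its total, multiply by the scale factor, then log1p
per.cell.total <- colSums(counts)
normalized <- log1p(sweep(counts, 2, per.cell.total, "/") * scale.factor)

print(normalized)
```

Zero counts stay at zero after the transformation, which keeps the matrix sparse.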

8.2 Identification of highly variable features (feature selection)

xenopus <- FindVariableFeatures(
  xenopus,
  selection.method = "vst",
  nfeatures = 2000
)
# Identify the 10 most highly variable genes
top10 <- head(VariableFeatures(xenopus), 10)
top10
##  [1] "LOC108714608" "LOC121398924" "LOC121393091" "LOC121394899" "itln1.L"     
##  [6] "otogl2.L"     "gfus.L"       "XB5922676.S"  "LOC108702929" "gpx3.S"
# plot variable features with and without labels
plot1 <- VariableFeaturePlot(xenopus)
plot2 <- LabelPoints(plot = plot1, points = top10, repel = TRUE)
## When using repel, set xnudge and ynudge to 0 for optimal results
plot1 + plot2

top10
##  [1] "LOC108714608" "LOC121398924" "LOC121393091" "LOC121394899" "itln1.L"     
##  [6] "otogl2.L"     "gfus.L"       "XB5922676.S"  "LOC108702929" "gpx3.S"

8.3 Scaling the data

Next, we apply a linear transformation (‘scaling’) that is a standard pre-processing step prior to dimensional reduction techniques like PCA. The ScaleData() function:

  • Shifts the expression of each gene, so that the mean expression across cells is 0
  • Scales the expression of each gene, so that the variance across cells is 1
    • This step gives equal weight in downstream analyses, so that highly-expressed genes do not dominate
  • The results of this are stored in xenopus[["RNA"]]@scale.data
# Conduct scaling to everything
tictoc::tic()
xenopus <- ScaleData(
  xenopus, 
  features = rownames(xenopus),
  vars.to.regress = "percent.mt"
)
## Regressing out percent.mt
## Centering and scaling data matrix
tictoc::toc()
## 397.395 sec elapsed
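
The centering and scaling described in the bullets can be checked on a toy matrix with base R. This is only a sketch of the per-gene z-score; it does not reproduce the regression of percent.mt, and as far as I know ScaleData() also clips large values (scale.max, 10 by default), which is not shown here. The matrix below is random, made-up data.

```r
set.seed(1)

# Toy "normalized expression": 3 genes (rows) x 50 cells (columns),
# each gene with a different mean and spread
expr <- matrix(
  rnorm(3 * 50, mean = c(0.1, 2, 5), sd = c(0.5, 1, 3)),
  nrow = 3,
  dimnames = list(c("geneA", "geneB", "geneC"), paste0("cell", 1:50))
)

# scale() works on columns, so transpose to scale each gene, then transpose back
scaled <- t(scale(t(expr)))

# After scaling, every gene has mean 0 and variance 1 across cells
print(round(rowMeans(scaled), 10))
print(apply(scaled, 1, var))
```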

8.4 Perform linear dimensional reduction (principal component)

tictoc::tic()
xenopus <- RunPCA(xenopus, features = VariableFeatures(object = xenopus))
## PC_ 1 
## Positive:  XB5922676.S, tuba1cl.3.L, LOC121397893, LOC121394899, LOC108702929, dynlrb2.S, tuba1cl.2.S, LOC108714608, tekt2.S, ccdc63.L 
##     dnali1.S, LOC121393091, LOC100137623, ak1.L, ak1.S, LOC108720114, LOC108719383, odf3.L, tubb4b.L, tuba1cl.2.L 
##     meig1.S, odf3.S, dynlt2b.L, fam166c.S, enkur.L, cfap45.S, LOC121399427, selenow2.L, rsph1.L, pifo.S 
## Negative:  LOC108709680, tmsb4x.L, XB22164552.S, pfn1.L, krt19.L, LOC108698272, ly6g6c.L, s100a10.S, s100a10.L, slit2.S 
##     LOC121395537, anxa1.2.S, acta2.S, mmp1.S, mt4.L, ets2.L, marcks.L, LOC121398924, LOC121395448, cldn6.2.S 
##     arpc5.S, olfm4.L, acta2.L, COX3, mmp8.S, LOC108699645, fkbp11.L, btg2.S, LOC108697796, XB22065621.L 
## PC_ 2 
## Positive:  atp6v0c.S, atp6v1al.L, atp6v0c.L, ca2.S, atp6v1g3.L, atp6v1g3.S, atp6v1b1.L, ca2.L, atp6v0e1.L, txn.L 
##     slc26a4l.L, atp6v0d2.L, atp6v0b.S, atp6v1d.S, atp6v1f.L, atp6ap1.1.S, atp6v0e1.S, hspe1.L, cycs.S, atp6ap1.1.L 
##     foxi1.L, LOC398702, cycs.L, hspe1.S, atp6v0a4.L, gsta1.S, wfdc2.L, cox7a2.L, fth1.1.S, atp5mc3.S 
## Negative:  LOC108709680, tmsb4x.L, krt19.L, pfn1.L, LOC108698272, ly6g6c.L, XB22164552.S, s100a10.S, slit2.S, s100a10.L 
##     anxa1.2.S, LOC121395537, acta2.S, tuba1cl.3.L, mmp1.S, gby.L, LOC108702929, btg2.S, cfap45.S, LOC121397893 
##     XB5922676.S, tuba1cl.2.S, LOC121394899, LOC121395448, ets2.L, LOC121393091, acta2.L, LOC108714608, ccdc63.L, olfm4.L 
## PC_ 3 
## Positive:  tmsb4x.L, krt19.L, pfn1.L, atp6v1g3.L, ca2.L, atp6v1al.L, atp6v1g3.S, azin2.S, atp6v1b1.L, ca2.S 
##     LOC108709680, atp6v0c.L, atp6v0b.S, atp6v0d2.L, atp6v1d.S, slc26a4l.L, atp6v0c.S, atp6ap1.1.S, atp6v0e1.S, foxi1.L 
##     atp6v1f.L, atp6ap1.1.L, atp6v0a4.L, LOC108719387, cystm1.S, tbc1d24.2.L, hspe1.S, atp5mc3.S, LOC108704370, cycs.S 
## Negative:  otogl2.L, LOC108699763, fucolectin.S, LOC108719453, itln1.L, LOC108696889, sult6b1.5.L, atp12a.L, LOC108696890, MGC68910 
##     atp1b2.S, LOC108697896, XB5953580.L, ldhb.S, LOC108700425, agr2.L, tll2l.L, LOC121397762, MGC84752, sytl1.S 
##     upk1a.S, crisp1.7.L, LOC108699644, XB5774338.L, LOC100158288, psca.S, LOC108699649, capn9.L, upk3a.L, gfus.L 
## PC_ 4 
## Positive:  LOC108697796, LOC121398924, ano1.L, camk1.L, kcna4.S, LOC108713813, mal2.S, mal.L, foxa1.L, pou3f1.S 
##     emx2.L, spdef.S, atp1b1.L, fkbp11.L, LOC108696980, galnt6.2.L, gpx3.S, dut.S, elapor1.S, LOC108697876 
##     pts.L, rtp3a.2.L, ATP6, elovl7.S, krt18.1.S, ND3, nans.S, sars1.S, XB5717875.L, LOC108696984 
## Negative:  tmsb4x.L, ly6g6c.L, XB22164552.S, krt19.L, pfn1.L, LOC108709680, LOC108698272, s100a10.S, s100a10.L, azin2.S 
##     LOC121395537, anxa1.2.S, mmp1.S, slit2.S, mt4.L, acta2.S, cldn6.2.S, btg2.S, arpc5.S, LOC108699649 
##     LOC121397602, LOC121395448, LOC108699645, ets2.L, olfm4.L, ctnnb1.L, atp1b2.S, acta2.L, aldob.L, XB5953580.L 
## PC_ 5 
## Positive:  gpx3.S, krt18.1.S, XB22065621.L, krt18.1.L, marcks.L, fn1.S, marcks.S, actc1.S, tnn.L, hoxc10.L 
##     LOC108709895, prmt1.L, XB5768883.L, pcdh8.2.L, rdd4.L, vim.L, XB5733233.S, LOC108696924, mycn.L, actc1.L 
##     cyyr1.L, fzd7.L, LOC108701391, cst3.L, col2a1.L, atp5mc3.L, LOC108704022, marcksl1.S, twist1.L, atp5mc3.S 
## Negative:  LOC121398924, LOC108698272, LOC108697796, camk1.L, ano1.L, LOC108709680, kcna4.S, slc26a4.3.S, LOC108713813, mal2.S 
##     slc26a4.3.L, XB22164552.S, ca12.L, fetub.S, slc16a3.L, atp1b1.L, mal.L, s100a10.S, foxa1.L, LOC108703568 
##     krt19.L, ndfip2.L, emx2.L, spdef.S, pfn1.L, galnt6.2.L, atp6v1b2.S, pou3f1.S, LOC108697862, psca.L
tictoc::toc()
## 11.219 sec elapsed
# Examine and visualize PCA results a few different ways
print(xenopus[["pca"]], dims = 1:5, nfeatures = 5)
## PC_ 1 
## Positive:  XB5922676.S, tuba1cl.3.L, LOC121397893, LOC121394899, LOC108702929 
## Negative:  LOC108709680, tmsb4x.L, XB22164552.S, pfn1.L, krt19.L 
## PC_ 2 
## Positive:  atp6v0c.S, atp6v1al.L, atp6v0c.L, ca2.S, atp6v1g3.L 
## Negative:  LOC108709680, tmsb4x.L, krt19.L, pfn1.L, LOC108698272 
## PC_ 3 
## Positive:  tmsb4x.L, krt19.L, pfn1.L, atp6v1g3.L, ca2.L 
## Negative:  otogl2.L, LOC108699763, fucolectin.S, LOC108719453, itln1.L 
## PC_ 4 
## Positive:  LOC108697796, LOC121398924, ano1.L, camk1.L, kcna4.S 
## Negative:  tmsb4x.L, ly6g6c.L, XB22164552.S, krt19.L, pfn1.L 
## PC_ 5 
## Positive:  gpx3.S, krt18.1.S, XB22065621.L, krt18.1.L, marcks.L 
## Negative:  LOC121398924, LOC108698272, LOC108697796, camk1.L, ano1.L
VizDimLoadings(xenopus, dims = 1:2, reduction = "pca")

ElbowPlot(xenopus)

xenopus <- FindNeighbors(xenopus, dims = 1:5)
## Computing nearest neighbor graph
## Computing SNN
xenopus <- FindClusters(xenopus, resolution = 0.1)
## Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck
## 
## Number of nodes: 4969
## Number of edges: 148042
## 
## Running Louvain algorithm...
## Maximum modularity in 10 random starts: 0.9605
## Number of communities: 6
## Elapsed time: 0 seconds
# If you haven't installed UMAP, you can do so via reticulate::py_install(packages =
# 'umap-learn')
xenopus <- RunUMAP(xenopus, dims = 1:5)
## Warning: The default method for RunUMAP has changed from calling Python UMAP via reticulate to the R-native UWOT using the cosine metric
## To use Python UMAP via reticulate, set umap.method to 'umap-learn' and metric to 'correlation'
## This message will be shown once per session
## 20:22:06 UMAP embedding parameters a = 0.9922 b = 1.112
## 20:22:06 Read 4969 rows and found 5 numeric columns
## 20:22:06 Using Annoy for neighbor search, n_neighbors = 30
## 20:22:06 Building Annoy index with metric = cosine, n_trees = 50
## 0%   10   20   30   40   50   60   70   80   90   100%
## [----|----|----|----|----|----|----|----|----|----|
## **************************************************|
## 20:22:07 Writing NN index file to temp file /var/folders/qp/vf8kcj3d33q8rcd916m21zwh0000gn/T//RtmpXBfLCJ/fileb52b2ddf1fdc
## 20:22:07 Searching Annoy index using 1 thread, search_k = 3000
## 20:22:09 Annoy recall = 100%
## 20:22:09 Commencing smooth kNN distance calibration using 1 thread with target n_neighbors = 30
## 20:22:10 Initializing from normalized Laplacian + noise (using irlba)
## 20:22:10 Commencing optimization for 500 epochs, with 195330 positive edges
## 20:22:17 Optimization finished
DimPlot(xenopus, reduction = "umap")

xenopus@meta.data
# install fast differential gene expression
devtools::install_github("immunogenomics/presto")

8.5 Quick survey and annotation based on differentially expressed marker expression

DefaultAssay( xenopus ) <- "RNA"

presto::wilcoxauc( xenopus, 'RNA_snn_res.0.1' ) %>%
  group_by( group ) %>%
  arrange( group, pct_out - pct_in) %>%
  # filter( row_number() <= 5 ) %>%
  dplyr::filter( group == 2 )
presto::wilcoxauc( xenopus, 'RNA_snn_res.0.1' ) %>%
  dplyr::filter( grepl("txn.L", feature)) # ionocyte = cluster == 1
presto::wilcoxauc( xenopus, 'RNA_snn_res.0.1' ) %>%
  dplyr::arrange( desc(auc) ) %>%
  dplyr::filter( grepl("foxa1", feature)) # small secretory cell = cluster == 2
presto::wilcoxauc( xenopus, 'RNA_snn_res.0.1' ) %>%
  dplyr::filter( feature == "itln1.L") # goblet cells = cluster == 3
presto::wilcoxauc( xenopus, 'RNA_snn_res.0.1' ) %>%
  dplyr::filter( grepl("tekt2.S", feature)) # multi ciliated cell = cluster == 4
presto::wilcoxauc( xenopus, 'RNA_snn_res.0.1' ) %>%
  dplyr::filter( grepl("gpx3.S", feature)) # basal = cluster == 5
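
Instead of querying one marker at a time, the same table can be summarised per cluster. Below is a sketch on a made-up data frame that only mimics three of the columns returned by presto::wilcoxauc() (feature, group, auc); the values are invented.

```r
library(dplyr)

# Hypothetical stand-in for the wilcoxauc() result
toy.markers <- data.frame(
  feature = rep(c("geneA", "geneB", "geneC"), times = 2),
  group   = rep(c("0", "1"), each = 3),
  auc     = c(0.90, 0.60, 0.50, 0.40, 0.95, 0.70)
)

# Keep the two features with the highest AUC in every cluster
top.markers <- toy.markers %>%
  group_by(group) %>%
  slice_max(auc, n = 2) %>%
  ungroup()

print(top.markers)
```

With the real object, toy.markers would be replaced by the output of presto::wilcoxauc( xenopus, 'RNA_snn_res.0.1' ).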

Traditional way of finding marker genes (as an example):

# find all markers of cluster 5
cluster5.markers <- FindMarkers(xenopus, ident.1 = 5, min.pct = 0.25)
## For a more efficient implementation of the Wilcoxon Rank Sum Test,
## (default method for FindMarkers) please install the limma package
## --------------------------------------------
## install.packages('BiocManager')
## BiocManager::install('limma')
## --------------------------------------------
## After installation of limma, Seurat will automatically use the more 
## efficient implementation (no further action necessary).
## This message will be shown once per session
head(cluster5.markers, n = 5)

Below is boilerplate code to check the expression pattern:

FeaturePlot(
  xenopus,
  features = c("gpx3.S", "prmt1.S"),
  order = T
)

# new.cluster.ids <- c(
#   "1" = "ionocyte",
#   "2" = "small secretory",
#   "3" = "goblet",
#   "4" = "multi-ciliated",
#   "5" = "basal"
# )

xenopus$annotation <- case_when(
  xenopus$RNA_snn_res.0.1 == 1 ~ "ionocyte",
  xenopus$RNA_snn_res.0.1 == 2 ~ "small_secretory",
  xenopus$RNA_snn_res.0.1 == 3 ~ "goblet",
  xenopus$RNA_snn_res.0.1 == 4 ~ "multi_ciliated",
  xenopus$RNA_snn_res.0.1 == 5 ~ "basal",
  xenopus$RNA_snn_res.0.1 == 0 ~ "early_epithelial_progenitor",
  TRUE ~ "ambiguous"
)
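
An alternative to the case_when() call is a named lookup vector. The sketch below uses a made-up vector of cluster labels in place of xenopus$RNA_snn_res.0.1; note that the names must be quoted strings, and that a factor has to be converted with as.character() before it is used as an index.

```r
# Cluster id -> annotation (names must be quoted)
new.cluster.ids <- c(
  "0" = "early_epithelial_progenitor",
  "1" = "ionocyte",
  "2" = "small_secretory",
  "3" = "goblet",
  "4" = "multi_ciliated",
  "5" = "basal"
)

# Hypothetical cluster assignments; "7" has no annotation on purpose
clusters <- factor(c(0, 1, 5, 3, 7))

# Index by name, not by the factor's internal integer codes
annotation <- unname(new.cluster.ids[as.character(clusters)])

# Anything without an entry in the lookup becomes "ambiguous"
annotation[is.na(annotation)] <- "ambiguous"

print(annotation)
```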

DimPlot(
  xenopus,
  group.by = "annotation",
  label = T
) +
  theme( legend.position = "bottom" )

grep("dll", rownames(xenopus), value = T)
## [1] "dll1.L" "dll1.S"
FeaturePlot(
  xenopus,
  features = c(
#    "htr3a.L", # serotonin receptor
    "notch2.L",
    "notch2.S",
    "notch1.L",
    "notch1.S"
  ),
  order = T
)

FeaturePlot(
  xenopus,
  features = c(
    "dll1.L",
    "dll1.S"
  ),
  order = T
)

Save any variable that is important. Explicitly selecting the variables to keep for future use is more helpful than saving the current environment as is.

tictoc::tic()
saveRDS( xenopus, file=glue::glue("{project.prefix}xenopusobject.rds"))
tictoc::toc()
## 33.331 sec elapsed

9 Deconvolution

9.1 Deconvolution of Bulk RNA-seq data from scRNA-seq (CIBERSORTx)

9.1.1 Check bulk RNA-seq dataset

Check first what we have for the bulk RNA-seq dataset:

cpm <- read_tsv("../ChungKwon2014.XENLA_rfx2mo_exp/Chung2014.cpm_table.tsv")
## Rows: 42675 Columns: 5
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: "\t"
## chr (1): ID
## dbl (4): ctrlA, ctrlB, rfx2moA, rfx2moB
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
cpm

We need to check whether the gene names are interoperable between the two datasets.

intersect( rownames(xenopus), cpm$ID ) %>% length()
## [1] 17509
setdiff( cpm$ID, rownames(xenopus) ) %>% length()
## [1] 25166
setdiff( rownames(xenopus), cpm$ID ) %>% length()
## [1] 5

There are 5 genes present in the scRNA-seq data (based on XenLae10.1) that are NOT present in the CPM data:

setdiff( rownames(xenopus), cpm$ID )
## [1] "XFG 5-1"            "ccdc50.L-1"         "unassigned-gene-2" 
## [4] "unassigned-gene-4"  "unassigned-gene-23"
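
A quick sanity check on the three counts reported in this subsection: the shared genes plus the bulk-only IDs should add up to the number of rows in the CPM table (17,509 + 25,166 = 42,675, which matches the row count reported by read_tsv). The same bookkeeping on a toy example, with made-up gene names:

```r
# Hypothetical gene names standing in for rownames(xenopus) and cpm$ID
sc.genes <- c("a.L", "a.S", "b.L", "XFG 5-1")
bulk.ids <- c("a.L", "a.S", "b.L", "c.L", "d.S")

shared    <- intersect(sc.genes, bulk.ids)  # present in both
only.bulk <- setdiff(bulk.ids, sc.genes)    # present only in the bulk table
only.sc   <- setdiff(sc.genes, bulk.ids)    # present only in the scRNA-seq data

# The parts must add up to the wholes (assuming no duplicated IDs)
stopifnot(length(shared) + length(only.bulk) == length(unique(bulk.ids)))
stopifnot(length(shared) + length(only.sc)   == length(unique(sc.genes)))

print(c(shared = length(shared), only.bulk = length(only.bulk), only.sc = length(only.sc)))
```

If the first check fails on real data, duplicated IDs in the bulk table are a likely cause.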

9.1.2 Generating the reference matrix for CIBERSORTx

Shall we “normalize” the number of reference cells across annotations? This might be necessary if there is a very unequal distribution of gene expression.

xenopus@meta.data %>%
  dplyr::count( annotation )

Except for the early epithelial progenitor annotation, most annotations have a comparable number of cells, so we will not normalize the reference.

# Extract the count matrix from the scRNA-seq
tictoc::tic()
reference <- as.data.frame(as.matrix( GetAssayData( xenopus, slot = "counts" ) ))
tictoc::toc()
## 5.09 sec elapsed
ncol(reference)
## [1] 4969
nrow(xenopus@meta.data)
## [1] 4969
reference <- reference[intersect( rownames(xenopus), cpm$ID ), ]

tictoc::tic()  # start a new timer for the write step; the previous one was already closed
reference %>% rownames_to_column("Genesymbol") %>% 
  dplyr::select( "Genesymbol", everything() ) %>%
  write_tsv( 
    file= glue::glue("{project.prefix}reference.txt")
  )
tictoc::toc()
# tictoc::tic()
# write.table(
#   reference, 
#   file=glue::glue("{project.prefix}reference.txt"), 
#   sep = "\t", 
#   quote=FALSE, 
#   row.names = TRUE,
#   col.names = TRUE
# )
# tictoc::toc()
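
One thing worth double-checking before uploading: as far as I understand the CIBERSORTx input format, the header row of a single cell reference matrix should carry the cell type label of each cell rather than its barcode. Treat this as an assumption and verify it against the CIBERSORTx tutorial. A minimal sketch of such a relabelling, using a toy matrix and made-up barcodes and annotations:

```r
# Toy count matrix: 2 genes x 3 cells; barcodes and counts are invented
counts <- matrix(
  c(1, 4,
    0, 2,
    3, 0),
  nrow = 2,
  dimnames = list(c("geneA", "geneB"), c("AAAC-1", "AAAG-1", "AACT-1"))
)

# Hypothetical barcode -> annotation lookup (stands in for the annotation column)
annotation <- c("AAAC-1" = "goblet", "AAAG-1" = "basal", "AACT-1" = "goblet")

# Replace barcodes with cell type labels; duplicated labels are intended here
labelled <- counts
colnames(labelled) <- unname(annotation[colnames(counts)])

# Write the header by hand so the first column is named "Genesymbol",
# then append the rows without a second header
out.file <- tempfile(fileext = ".txt")
writeLines(paste(c("Genesymbol", colnames(labelled)), collapse = "\t"), out.file)
write.table(
  labelled,
  file = out.file, append = TRUE, sep = "\t",
  quote = FALSE, row.names = TRUE, col.names = FALSE
)

written <- readLines(out.file)
print(written)
```

With the real data, the lookup would come from the annotation column of xenopus@meta.data, matched by cell barcode to the columns of reference.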

Upload the reference file (the one that has the Genesymbol column) to CIBERSORTx as a single cell reference matrix file.

Uploading your scRNA-seq count matrix 1

Uploading your scRNA-seq count matrix 2

Then generate the signature matrix.

Running to create a signature matrix

If it works well, it will generate a heatmap of the signature matrix.

Result of signature matrix generation

Then you can conduct the cell fraction inference. That will result in an estimate of the cell type fractions, which you can download.

Notice that the controls have about 18% multi-ciliated cells and some fraction of goblet cells, but in the morpholino case these cell populations are gone.

9.2 Deconvolution of organisms

Show the slides.

10 Building your own gene sets (Gene Set Enrichment Analysis)

See the YouTube video! (Enrichr)